Abstract

Let $M$ be a random $m \times n$ matrix with binary entries and i.i.d. rows. The weight
(i.e., number of ones) of a row has a specified probability distribution, with the row
chosen uniformly at random given its weight. Let $\mathcal{N}(n,m)$ denote the number of
left null vectors in $\{0,1\}^m$ for $M$ (including the zero vector), where addition is mod 2.
We take $n, m \to \infty$, with $m/n \to \alpha > 0$, while the weight distribution may vary
with $n$ but converges weakly to a limiting distribution on $\{3, 4, 5, \ldots\}$; let $W$ denote
a variable with this limiting distribution. Identifying $M$ with a hypergraph on $n$
vertices, we define the 2-core of $M$ as the terminal state of an iterative algorithm
that deletes every row incident to a column of degree 1.

We identify two thresholds $\alpha^*$ and $\underline{\alpha}$, and describe them analytically in terms
of the distribution of $W$. Threshold $\alpha^*$ marks the infimum of values of $\alpha$ at which
$n^{-1} \log \mathbb{E}[\mathcal{N}(n,m)]$ converges to a positive limit, while $\underline{\alpha}$ marks the infimum of
values of $\alpha$ at which there is a 2-core of non-negligible size compared to $n$ having
more rows than non-empty columns.

We have $1/2 \le \alpha^* \le \underline{\alpha} \le 1$, and typically these inequalities are strict; for
example when $W = 3$ almost surely, numerics give $\alpha^* = 0.88949\ldots$ and $\underline{\alpha} =
0.91793\ldots$ (previous work on this model has mainly been concerned with such cases
where $W$ is non-random). The threshold of values of $\alpha$ for which $\mathcal{N}(n,m) \ge 2$ in
probability lies in $[\alpha^*, \underline{\alpha}]$ and is conjectured to equal $\underline{\alpha}$.

The random row weight setting gives rise to interesting new phenomena not
present in the non-random case that has been the focus of previous work.

1 Introduction



Suppose that $M := M(n,m)$ is an $m \times n$ matrix with entries in $\{0,1\}$, each of whose
rows contains at least one 1, for which we seek a left null vector over GF[2], i.e. a row
vector $a \in \{0,1\}^m$ such that $aM = \mathbf{0} \pmod 2$, where here and elsewhere $\mathbf{0}$ is the all-0
vector. More generally, elements of $M$ might belong to the finite field GF[$q$] of order $q$.
We are interested in the case where $M$ is sparse and random, as specified below.

Let $X_1, X_2, \ldots, X_m$ denote the vectors constituting the rows of $M$, and let $\sigma(n,m)$
denote the co-rank over GF[2], namely

$$\sigma(n,m) := m - \dim \operatorname{span}\{X_1, X_2, \ldots, X_m\}, \tag{1.1}$$

where here and subsequently 'span' indicates the linear span over GF[2]. Then the number
of null vectors of $M$, including the zero vector, is

$$\mathcal{N}(n,m) = 2^{\sigma(n,m)}, \tag{1.2}$$

which counts the number of distinct solutions in $\{0,1\}^m$, including the zero solution, to

$$a_1 X_1 + \cdots + a_m X_m = \mathbf{0} \pmod 2. \tag{1.3}$$

Note that for a fixed $n$ and a given realization of the sequence of rows $X_1, X_2, \ldots$, the
numbers $\mathcal{N}(n,m)$ are nondecreasing as $m$ increases.
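As a concrete illustration of (1.1) to (1.3), the following minimal sketch computes the GF[2] rank of a small binary matrix and checks the count $2^{\sigma(n,m)}$ against brute-force enumeration of (1.3). The encoding of rows as Python integers (bit $j$ standing for column $j$), the function names and the example matrix are all assumptions made for illustration; they are not taken from the paper.

```python
from itertools import product


def gf2_rank(rows):
    """Rank over GF(2) of rows encoded as ints (bit j = entry in column j)."""
    basis = {}  # leading-bit position -> reduced row with that leading bit
    for row in rows:
        while row:
            lead = row.bit_length() - 1
            if lead not in basis:
                basis[lead] = row  # row is independent of the earlier rows
                break
            row ^= basis[lead]     # eliminate the leading bit (addition mod 2)
    return len(basis)


def null_vector_count(rows):
    """Number of left null vectors, 2**co-rank, as in (1.1)-(1.2)."""
    return 2 ** (len(rows) - gf2_rank(rows))


def null_vector_count_brute(rows):
    """Count solutions a in {0,1}^m of (1.3) by direct enumeration."""
    count = 0
    for a in product((0, 1), repeat=len(rows)):
        acc = 0
        for a_i, row in zip(a, rows):
            if a_i:
                acc ^= row
        count += (acc == 0)
    return count


# Hypothetical 4 x 4 example: the third row is the mod-2 sum of the first two.
example_rows = [0b0111, 0b1110, 0b1001, 0b1011]
```

The elimination keeps one reduced row per leading bit, so each new row costs at most $n$ XOR operations; the brute-force count is only practical for very small $m$ and serves as a check.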

Suppose that $n, m \to \infty$, with $m/n \to \alpha > 0$. Our goal is to examine the limiting
behaviour of the expected number $\mathbb{E}[\mathcal{N}(n,m)]$ of left null vectors, and the limiting
probability $\mathbb{P}[\sigma(n,m) > 0]$ of a mod-2 linear dependency of the rows of $M(n,m)$, as a
function of the parameter $\alpha$, and especially to derive computable thresholds at which
phase transitions occur. We also study the rate of exponential decay of the probability
that $\mathbf{1} := (1, 1, \ldots, 1)$ is a null vector.

The probabilistic setting that we consider has the rows $X_1, X_2, \ldots, X_m$ being
independent and identically distributed (i.i.d.) with the law of a random vector $X = X(n) \in
\{0,1\}^n$. The problem has different flavours depending on the underlying law of $X$, and
several regimes have received considerable attention in the literature, including:

(a) The dense regime in which $X$ has order $n$ non-zero components; the standard model
studied in this regime has $X$ distributed uniformly over $\{0,1\}^n$.

(b) The classical sparse regime in which $X$ has order $\log n$ non-zero components.

(c) The uniformly (very) sparse regime in which $X$ has $O(1)$ non-zero components.

The main focus of the present paper is regime (c) (albeit our '$O(1)$' may be random
for each row, and might not even have a mean); in Section 2.6 below we briefly discuss
other models that have been studied. In the simplest case, $X$ contains a fixed number
$r \le n$ of non-zero components (the cases $r = 1$ and $r = 2$ have distinct behaviour from
the case $r \ge 3$); in more generality the number of non-zero components is randomly
distributed according to a given weight distribution.

Before formally describing our model in detail and presenting our main results (in
Section 2), we make some remarks on motivation, and on the literature. Note that
$\sigma(n,m) = 0$ if and only if $M$ has row rank $m$, which occurs if and only if $M$ has
column rank $m$. Thus the absence of non-trivial left null vectors is equivalent to all
column vectors in $\{0,1\}^m$ being expressible as a linear combination of the columns of
$M$ (with addition modulo 2), or in other words, to there being a solution $x \in \{0,1\}^n$
to $Mx = y$ for all column vectors $y \in \{0,1\}^m$. In the special case of $r = 2$, motivation
for considering this question is discussed at the start of [24, Chapter 3]. The following
interpretations help to motivate the general case.

A scheduling problem. Suppose that a tennis club is organizing its annual schedule. 
There are n playing days, and m potential players. Each player wants to play on a given 
subset of the days; if there is not a match available on every one of these days, they 
refuse to pay the annual membership. Each day, in order for nobody to be left out, an 
even number of players is required. Each possible schedule satisfying these requirements 
is a left null vector mod 2; the one with the most units achieves the maximal income for 
the tennis club. 



Randomized Lights Out. This is a variant of the game 'Lights Out' [33]. Each of $m$
lamps can be either on or off, and there are $n$ switches, each of which is incident to a
specified subset of the lamps, as given by the random matrix $M$; Lamp $i$ and Switch $j$
are mutually incident if and only if the $(i,j)$ entry of $M$ is 1. If a switch is toggled,
all of the lamps incident to it have their status changed from on to off or off to on. One 
may ask whether all states (i.e. configurations of on and off lamps) are accessible from 
the 'all off' state by using some sequence of switches (or equivalently, if the 'all off' state 
is accessible from all possible starting states), and this is equivalent to the question of 
whether the column rank of M is m. 



The XORSAT problem. This is a variant of the random satisfiability problem [31]
where there are $n$ Boolean variables which may be deemed true or false. Each row of
$M$ represents a clause built as the logical XOR (exclusive OR) involving those Boolean
variables corresponding to columns incident to this row, so the clause is true if an odd
number of the variables incident to the row are deemed true. Given a vector $y \in \{0,1\}^m$,
finding a solution $x$ to $Mx = y$ corresponds to finding a truth-assignment for the Boolean
variables so that each clause $i$ is true if $y_i = 1$ and false if $y_i = 0$. Thus the column rank
is $m$ if and only if the problem is satisfiable for all possible choices of $y$.



A spin-glass model. The relationship between satisfiability problems and spin glasses
has already been noted in [31]. In the present instance, consider the following variant
of the well-known Sherrington-Kirkpatrick mean-field spin-glass model (see e.g. [34]).
There is a random collection of hyperedges on $n$ vertices, represented by the $m$ rows
of $M$. Each hyperedge $i$ has a sign taking value $(-1)^{y_i}$. Each vertex $j$ is assigned
a spin $\sigma_j$ taking values in $\{-1, 1\}$, and (at zero temperature) the probability measure
on the state-space is concentrated on states of minimal energy, i.e. with maximal value
of $\sum_i (-1)^{y_i} e_i$, where $e_i$ here denotes the product of spins at vertices in hyperedge $i$. The
existence of a configuration with all terms in the sum equal to $+1$ is equivalent to the
existence of a solution to $Mx = y$.



The Ehrenfest urn model and the random walk on the hypercube. In the Ehrenfest
model of heat exchange, a box contains $n$ particles, some of which are red and the rest
blue. At each step, a particle is sampled uniformly from the box and changes its colour.
(For a sample of the large literature, see e.g. [lm p. 121], pM §3.5], [32, §3.5], or [21].)
In the case where $X$ has a single unit entry, we may view each row of $M$ as selecting
which particle is to be changed at that step. Then $\mathbf{1}$ is a null vector for $M$ if and only
if the model returns to the initial state after $m$ steps. This may also be interpreted as
a random walk on the graph whose vertices are $\{0,1\}^n$ and edges are present between
those vertices that differ in a single component; the event that $\mathbf{1}$ is null corresponds to
the walker being back in his starting state after $m$ steps.

The general case, allowing other weight distributions, corresponds to a generalization
of the Ehrenfest model whereby multiple 'diffusions' are allowed, i.e. at each step
several particles may change colour at once; cf. [30, Chapter 10]. This can be similarly
interpreted in terms of a walk on a version of the hypercube with additional edges.

There is a large body of work on the properties of random matrices over finite fields
and the closely related subject of random linear equations over finite fields. Surveys are
provided by the book [24, Chapter 3] as well as the articles [28, 29]. The problems may
also be formulated in terms of random hypergraphs: each row represents a hyperedge, and
each column represents a vertex; for details see Section 5 below. They are also related
to the XORSAT problem in Boolean algebra, as mentioned above (see also Section 2.6.2
below). Generally, such models can be described in the framework of random allocation
or occupancy problems: see the books [21, 24, 26, 30].



The null-vector problem in the fixed row-weight case has received several treatments 
in the literature, and it is not easy to reconcile all of the existing results, due to differ- 
ences in presentation and also differences in the underlying probabilistic models. One 
contribution of the present paper is to clarify some of these issues, including giving a 
rigorous justification that the results are unchanged under small perturbations of the un- 
derlying model. Our main contribution, however, is to treat the case of genuinely random 
row weights, which has not previously been studied. We mention that there has recently
been renewed interest in this area in several scientific communities: for example, Alamino
and Saad [1] give a statistical physics approach to the null-vector problem; Ibrahimi et
al. [19] treat the related problem of random XORSAT; Costello and Vu [9] study the rank
of random symmetric (so in particular, square) matrices.

Throughout the paper, we extend the function $x \mapsto x^x$, $x > 0$, continuously to $x = 0$,
so that $0^0 := 1$. We define the weight of a vector $v = (v_1, \ldots, v_n) \in \{0,1\}^n$ to be
$w(v) := \sum_{i=1}^n v_i$, i.e., the number of unit entries. For $n \in \mathbb{N} := \{1, 2, \ldots\}$ we write
$[n] := \{1, 2, \ldots, n\}$. We write $\xrightarrow{d}$ for convergence in distribution.



2 Results and discussion 

2.1 Description of the random matrix model 

Given $n \in \mathbb{N}$, suppose that $X = X(n) \in \{0,1\}^n$ is a random row vector, selected
according to some probability law on $\{0,1\}^n$. Consider a sequence of i.i.d. random vectors
$X_1, X_2, \ldots$ with the same law as $X$. Let $M := M(n,m)$ be the $m \times n$ matrix whose rows
are $X_1, X_2, \ldots, X_m$.

We will consider $X$ with law of the following form. Let $W$ be an $\mathbb{N}$-valued random
variable (so $\mathbb{P}[W \ge 1] = 1$) whose law will be the (limiting) weight distribution of our
random vector $X$. Let $W_1, W_2, \ldots$ be a sequence of random variables with $W_n \in [n]$
such that $W_n \xrightarrow{d} W$ as $n \to \infty$. Let $w(X)$ have the distribution of $W_n$, and for
each $k \in [n]$ let the conditional distribution of $X$, given $w(X) = k$, be uniform over
$\{x \in \{0,1\}^n : w(x) = k\}$.

Let $\rho(s) := \mathbb{E}[s^W]$ and $\rho_n(s) := \mathbb{E}[s^{W_n}]$ denote the probability generating functions of
$W$ and $W_n$, respectively. We use $\mathbb{P}_{\rho_n}$ and $\mathbb{E}_{\rho_n}$ for the probability and expectation for the
random matrix model with $n$ columns and row weight distribution given by $\rho_n$. We shall
say $W_n$ are uniformly bounded if there is a finite constant $r_1$ such that $\mathbb{P}[W_n \le r_1] = 1$
for all $n$ (and hence $\mathbb{P}[W \le r_1] = 1$ as well).



2.2 Threshold results in the general setting 

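Given the probability generating function $\rho$, define the threshold

$$\alpha^*_\rho := \inf\{\alpha \ge 0 : F_\rho(\alpha) > 0\}, \tag{2.1}$$

where we set

$$F_\rho(\alpha) := \log \sup_{\gamma \in [0,1/2]} \left( \frac{(1 + \rho(1 - 2\gamma))^{\alpha}}{2 \gamma^{\gamma} (1 - \gamma)^{1 - \gamma}} \right), \quad \alpha \ge 0. \tag{2.2}$$

We state some fundamental properties of $F_\rho(\alpha)$ and $\alpha^*_\rho$ in the next result, which we
prove in Section 4.2.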



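Proposition 2.1. We have $F_\rho(\alpha) = 0$ for $0 \le \alpha \le \alpha^*_\rho$ but $F_\rho(\alpha) > 0$ for $\alpha > \alpha^*_\rho$, and $F_\rho$
is continuous and nondecreasing as a function of $\alpha$. Moreover:

(i) $\alpha^*_\rho \in [0, 1]$; and $\alpha^*_\rho < 1$ if $\mathbb{E}[W] < \infty$.
(ii) $\alpha^*_\rho = 0$ if and only if $\mathbb{P}[W = 1] > 0$.
(iii) If $\mathbb{P}[W = 2] = 1$, then $\alpha^*_\rho = 1/2$.
(iv) Suppose that $\tilde{W}$ is another $\mathbb{N}$-valued random variable, with $\tilde{\rho}(s) = \mathbb{E}[s^{\tilde{W}}]$, such that
$\tilde{\rho}(s) \le \rho(s)$ for all $s \in [0,1]$ (which is the case, for example, if $\tilde{W}$ stochastically
dominates $W$). Then $\alpha^*_{\tilde{\rho}} \ge \alpha^*_\rho$.

In particular, if $\mathbb{P}[W \ge 2] = 1$ and $\mathbb{E}[W] < \infty$, then $\alpha^*_\rho \in [1/2, 1)$.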
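To get a feel for (2.1) and (2.2), one can evaluate $F_\rho(\alpha)$ numerically by maximizing over a grid of $\gamma$ values. The sketch below does this for the fixed-weight case $\rho(s) = s^3$; the grid-search approach, the grid size and the function names are assumptions chosen for simplicity, not a method used in the paper.

```python
import math


def log_objective(gamma, alpha, rho):
    """Log of the bracketed quantity in (2.2) at a given gamma in [0, 1/2]."""
    def xlogx(x):
        # Convention 0^0 = 1, i.e. 0 * log 0 = 0.
        return x * math.log(x) if x > 0 else 0.0

    return (alpha * math.log1p(rho(1.0 - 2.0 * gamma))
            - math.log(2.0) - xlogx(gamma) - xlogx(1.0 - gamma))


def F(alpha, rho, grid=20000):
    """Grid approximation (from below) of F_rho(alpha) in (2.2)."""
    return max(log_objective(0.5 * k / grid, alpha, rho) for k in range(grid + 1))


def rho3(s):
    """Generating function of the constant weight W = 3."""
    return s ** 3
```

Since $\gamma = 1/2$ always contributes the value 0, the grid maximum is non-negative; for $\rho(s) = s^3$ it stays at 0 for $\alpha$ below roughly 0.889 and becomes strictly positive above that, in line with Proposition 2.1.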

Here is our first main result, describing the threshold behaviour of the expected
number of null vectors. We shall prove this in Section 4.5.

Theorem 2.2. Suppose that $m_n/n \to \alpha \in (0, \infty)$ as $n \to \infty$. Then

$$\lim_{n \to \infty} n^{-1} \log \mathbb{E}_{\rho_n}[\mathcal{N}(n, m_n)] = F_\rho(\alpha); \tag{2.3}$$

in particular, the limit in (2.3) is strictly positive for $\alpha > \alpha^*_\rho$ and zero for $\alpha \le \alpha^*_\rho$.
Moreover, if in addition there exist $r_0 \ge 3$ and $r_1 < \infty$ such that $\mathbb{P}[r_0 \le W_n \le r_1] = 1$
for all $n$, and $\alpha \in (0, \alpha^*_\rho)$, then as $n \to \infty$,

$$\mathbb{E}_{\rho_n}[\mathcal{N}(n, m_n)] = 1 + O(n^{2 - r_0}). \tag{2.4}$$



Remark. For $\alpha < \alpha^*_\rho$, the expectation in (2.4) is dominated by low-weight null vectors
(short hypercycles), in particular by null vectors with only 2 non-zeros. It may be possible,
by extending the argument of Lemma 4.3 below, to expand this expectation as a power
series in $n^{-1}$, but we do not pursue this here. For $\alpha > \alpha^*_\rho$ the exponential rate in (2.3) is
dominated by null vectors using a specific positive proportion $\beta = \beta(\alpha)$ of the (roughly
$\alpha n$) available rows (and also possibly those using proportion $1 - \beta$ of the rows, due to
parity effects); in fact, $\beta = \frac{\rho(1 - 2\gamma_0)}{1 + \rho(1 - 2\gamma_0)} \in (0, 1/2)$, where $\gamma_0 = \gamma_0(\alpha) \in (0, 1/2)$ is the value
of $\gamma$ for which the supremum in (2.2) is attained. See Section 4.5 for details.



Theorem 2.2 deals with the expected number of null vectors. Also of interest is, for a
fixed $n$, the (random) number of rows $m$ at which the first non-zero null vector appears.
Define

$$T_n := \min\{m \in \mathbb{N} : X_m \in \operatorname{span}\{X_1, X_2, \ldots, X_{m-1}\}\}, \tag{2.5}$$

the first $m$ for which $\operatorname{rank}(M(n,m)) < m$. Standard linear algebra implies that $T_n \le n + 1$.
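The stopping time in (2.5) can be simulated by maintaining an XOR basis of the rows seen so far and stopping at the first row that reduces to zero. The sketch below is one possible way to do this under stated assumptions: rows are encoded as Python integers, the helper `random_row` draws a row of fixed weight uniformly at random, and all names and parameters are hypothetical rather than taken from the paper.

```python
import random


def first_dependency_time(rows):
    """Smallest m (1-indexed) with rows[m-1] in the GF(2) span of rows[:m-1].

    Rows are ints (bit j = column j). Returns None if all rows are independent.
    """
    basis = {}  # leading-bit position -> reduced row
    for m, row in enumerate(rows, start=1):
        while row:
            lead = row.bit_length() - 1
            if lead not in basis:
                basis[lead] = row
                break
            row ^= basis[lead]
        else:
            # The row reduced to zero: it lies in the span of the earlier rows.
            return m
    return None


def random_row(n, weight, rng):
    """Row of the given weight, uniform over all such rows in {0,1}^n."""
    return sum(1 << j for j in rng.sample(range(n), weight))


def sample_T(n, weight, rng):
    """One sample of T_n for constant row weight; n + 1 rows always suffice."""
    rows = [random_row(n, weight, rng) for _ in range(n + 1)]
    return first_dependency_time(rows)
```

For example, averaging `sample_T(n, 3, rng) / n` over several runs with moderately large `n` gives an empirical view of where $T_n/n$ concentrates.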

We define another threshold, $\underline{\alpha}_\rho$, through an analytic description that needs more
notation (we shall give a probabilistic interpretation later on). For $x \in (0, 1)$ set

$$\psi(x) := x + \left( 1 + \frac{\rho(x)}{\rho'(x)} - x \right) \log(1 - x); \tag{2.6}$$

$$h(x) := -\frac{\log(1 - x)}{\rho'(x)}. \tag{2.7}$$

Provided $\mathbb{P}[W \ge 1] = 1$ we can and do extend $\psi$ continuously to $\psi(0) := 0$, since
$\rho(s)/\rho'(s) = O(s)$ as $s \downarrow 0$. Note that $h(x) \to \infty$ as $x \downarrow 0$, provided $\mathbb{P}[W \ge 3] = 1$, and
that if $\mathbb{E}[W] < \infty$ then as $x \uparrow 1$ we have $h(x) \to \infty$ and $\psi(x) \to -\infty$. Set

$$\alpha^\sharp_\rho := \inf_{x \in (0,1)} h(x), \tag{2.8}$$

and note that $\alpha^\sharp_\rho \rho'(x) \le -\log(1 - x)$ for all $x \in (0, 1)$, so integrating from 0 to 1 we get
$\alpha^\sharp_\rho \le 1$, provided $\mathbb{P}[W \ge 1] = 1$. For $\alpha \ge 0$, define

$$g^*(\alpha) := \sup\{x \in (0, 1) : h(x) \le \alpha\}, \tag{2.9}$$

with the convention $\sup \emptyset = 0$ in operation in (2.9). Observe that if $h$ has unbounded
range (e.g. if $\mathbb{P}[W \ge 3] = 1$) then $h \circ g^*$ is the identity map on $[\alpha^\sharp_\rho, \infty)$. See Figure 1 for
an example. Define

$$\underline{\alpha}_\rho := \inf\{\alpha \ge \alpha^\sharp_\rho : \psi(g^*(\alpha)) < 0\}. \tag{2.10}$$

In (2.10), the set defining $\underline{\alpha}_\rho$ is non-empty provided $\mathbb{P}[W \ge 3] = 1$ and $\mathbb{E}[W] < \infty$, since
as $\alpha \to \infty$ we have $g^*(\alpha) \to 1$ and $\psi(g^*(\alpha)) \to -\infty$.
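The quantities just defined are straightforward to explore numerically when the weight distribution has finite support. The following sketch assumes the forms $\psi(x) = x + (1 - x + \rho(x)/\rho'(x)) \log(1-x)$ and $h(x) = -\log(1-x)/\rho'(x)$, represents the weight distribution as a dictionary (an assumed encoding), and approximates the infimum over $x$ and $g^*$ on a uniform grid, which is a simplification rather than the paper's method.

```python
import math


def make_rho(pmf):
    """pmf maps weight k -> P[W = k]; returns the pair (rho, rho')."""
    def rho(s):
        return sum(p * s ** k for k, p in pmf.items())

    def drho(s):
        return sum(p * k * s ** (k - 1) for k, p in pmf.items())

    return rho, drho


def h(x, drho):
    return -math.log1p(-x) / drho(x)


def psi(x, rho, drho):
    return x + (1.0 - x + rho(x) / drho(x)) * math.log1p(-x)


def alpha_sharp(drho, grid=100000):
    """Grid approximation of the infimum of h over (0, 1)."""
    return min(h(k / grid, drho) for k in range(1, grid))


def g_star(alpha, drho, grid=100000):
    """Grid approximation of sup{x in (0,1) : h(x) <= alpha}; 0 if the set is empty."""
    for k in range(grid - 1, 0, -1):
        if h(k / grid, drho) <= alpha:
            return k / grid
    return 0.0


rho_fixed, drho_fixed = make_rho({3: 1.0})         # W = 3 almost surely
rho_mix, drho_mix = make_rho({3: 0.9, 24: 0.1})    # the mixture 0.9 s^3 + 0.1 s^24
```

With these helpers, the sign of `psi(g_star(alpha, drho), rho, drho)` can be scanned over a range of `alpha` values to locate the threshold in (2.10) approximately; for $W = 3$ the sign changes near $\alpha \approx 0.918$.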

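The relevance of $\underline{\alpha}_\rho$ for the null vector problem is shown by the next result.

Theorem 2.3. Suppose $W_n$ are uniformly bounded and $\mathbb{P}[W_n \ge 3] = 1$ for all $n$. Then
$\alpha^*_\rho \le \underline{\alpha}_\rho \le 1$, and for any $\varepsilon > 0$,

$$\lim_{n \to \infty} \mathbb{P}_{\rho_n}\left[ (\alpha^*_\rho - \varepsilon) n < T_n < (\underline{\alpha}_\rho + \varepsilon) n \right] = 1. \tag{2.11}$$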



Theorem 2.3 is proved in Section 5.3. We exclude the case where $\mathbb{P}[W \in \{1, 2\}] > 0$
from the statement of Theorem 2.3: different phenomena occur in that case (see
Proposition 2.9 below). This case is also discussed in [10], where the functions $\psi$ and $h$ also play
a role.

Typically, $\underline{\alpha}_\rho$ defined by (2.10) will satisfy $\psi(g^*(\underline{\alpha}_\rho)) = 0$. In many cases, there is a
single solution in $(0, 1)$, denoted $x^*$, to $\psi(x) = 0$, and $\underline{\alpha}_\rho = h(x^*)$. However, the situation
is complicated by the fact that $\alpha \mapsto g^*(\alpha)$ typically has at least one discontinuity. We
defer a more detailed discussion of the properties of the functions $\psi$, $h$, and $g^*$, and the
corresponding thresholds, to Section 5.3 below. Figure 1 provides an example.



[Figure 1: two plots; the axis tick labels are omitted here.]

Figure 1: Example with $\rho(s) = 0.9 s^3 + 0.1 s^{24}$. The left plot shows parts of the curves
$y = h(x)$ (all the line) and $x = g^*(y)$ (solid line). The right plot shows parts of the curves
$y = \psi(x)$ (all the line) and the locus of $(g^*(\alpha), \psi(g^*(\alpha)))$ (solid line). The left plot shows
that $g^*(\alpha)$ has two discontinuities, one at $\alpha = \alpha^\sharp_\rho \approx 0.908654$ and one at $\alpha \approx 0.938536$,
with the first corresponding to a jump from $g^* = 0$ to $g^* \approx 0.719682$ and the second to
a jump from $g^* \approx 0.835696$ to $g^* \approx 0.964919$. The right plot shows the single positive
solution of $\psi(x) = 0$ at $x = x^* \approx 0.987817$, so $\underline{\alpha}_\rho = h(x^*) \approx 0.991613$. It is not a
coincidence that the curves $h$ and $\psi$ seem to mirror each other: see Lemma 5.9 below.



Theorem 2.3 leaves open the sharp asymptotics of $T_n/n$: we believe that the upper
bound in Theorem 2.3 (i.e., $\underline{\alpha}_\rho$) is sharp:

Conjecture 2.4. If $W_n$ are uniformly bounded and $\mathbb{P}[W_n \ge 3] = 1$ for all $n$, then $T_n/n$
converges in probability to $\underline{\alpha}_\rho$ as $n \to \infty$.

An equivalent statement to the fixed-weight case $W = r \ge 3$ of this conjecture seems
to have been established recently in the random-XORSAT literature: see the comments
in Section 2.6.2.

The probabilistic interpretation of the thresholds $\alpha^\sharp_\rho$ and $\underline{\alpha}_\rho$ is in terms of the 2-core
of $M(n,m)$; this is the terminal state of an iterative algorithm that deletes every row
incident to a column of degree 1 (see Section 5 below for details). Let $E(n,m;\varepsilon)$ denote
the event that $M(n,m)$ possesses a 2-core (i) whose number of rows is bounded below
by $\varepsilon n$, and (ii) which contains more rows than columns of non-zero weight (all of which
have weight 2 or more). In particular, for $\varepsilon > 0$, $E(n,m;\varepsilon)$ implies that $M(n,m)$ has a
non-empty 2-core. If $W_n$ are uniformly bounded, and $m_n/n \to \alpha > 0$, then in Theorem
5.6 below we will show that, under certain additional conditions on $\rho$, there exists $\varepsilon > 0$
(allowed to depend on $\alpha$) such that $\lim_{n \to \infty} \mathbb{P}_{\rho_n}[E(n, m_n; \varepsilon)] = 1$ for $\alpha$ in some interval
of the form $(\underline{\alpha}_\rho, \underline{\alpha}_\rho + \delta)$ with $\delta > 0$.

A more delicate description of the behaviour of the 2-core in terms of the function $\psi$
defined at (2.6) will be given in Theorem 5.6 below: the limiting aspect ratio of the 2-core
being less than or greater than 1 depends on the sign of $\psi(g^*(\alpha))$. In the example shown
in Figure 1, and also in the fixed weight setting, $\psi(g^*(\alpha))$ changes sign only once, but in
the general random weight setting it may change sign multiple times; see the example
in Figure 3 below. Thus the random weight setting gives rise to subtle new phenomena
not present in the fixed weight case that has been the focus of previous work.
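The peeling procedure that defines the 2-core is simple to state in code. The sketch below is a minimal version under assumed conventions: each row is given as a set of column indices (the positions of its unit entries), and the function returns the indices of the rows that survive. The function name and the small examples are illustrative only.

```python
def two_core(rows):
    """Indices of the rows remaining after repeatedly deleting every row
    incident to a column of degree 1.

    rows: list of sets of column indices (the unit entries of each row).
    """
    alive = set(range(len(rows)))
    col_rows = {}  # column -> set of surviving rows incident to it
    for i, cols in enumerate(rows):
        for j in cols:
            col_rows.setdefault(j, set()).add(i)

    # Columns that currently have degree 1 are candidates for peeling.
    stack = [j for j, rs in col_rows.items() if len(rs) == 1]
    while stack:
        j = stack.pop()
        if len(col_rows[j]) != 1:
            continue  # stale entry: the degree changed since it was pushed
        (i,) = col_rows[j]
        alive.discard(i)
        for c in rows[i]:
            col_rows[c].discard(i)
            if len(col_rows[c]) == 1:
                stack.append(c)
    return sorted(alive)


# Hypothetical example: the last row touches two columns of degree 1 and is peeled;
# the remaining four rows form a 2-core on columns 0..3.
example = [{0, 1, 2}, {1, 2, 3}, {0, 1, 3}, {0, 2, 3}, {3, 4, 5}]
```

Any non-zero null vector is supported on rows of the 2-core, since a row containing the only unit entry of some column cannot take part in a sum that vanishes mod 2; comparing `len(two_core(rows))` with the number of columns still covered gives the aspect ratio discussed above.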

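Remark. Our techniques may be used to obtain information about the weight profile
of the left null space. Bayes' Theorem applied to equation (4.2) below shows that the
weight of a randomly chosen left null vector has distribution given by

$$\mathbb{P}_{\rho_n}[w(v) = k \mid vM = \mathbf{0}] = \frac{\binom{m}{k} \mathbb{P}_{\rho_n}[A(n,k)]}{\sum_{j=0}^{m} \binom{m}{j} \mathbb{P}_{\rho_n}[A(n,j)]},$$

where $A(n,k)$ is the event, defined at (2.12) below, that the sum of $k$ rows vanishes mod 2;
the quantities on the right-hand side here are studied in detail in Section 3.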



2.3 Even occupancy in random allocations 

Let $A(n,m)$ denote the event that the row vector $\mathbf{1} = (1, \ldots, 1)$ is null for $M$, i.e.,

$$A(n,m) := \{X_1 + \cdots + X_m = \mathbf{0} \pmod 2\}. \tag{2.12}$$

One interpretation is in terms of the random allocation model. Suppose we have $n$ urns,
and for each row of $M$ we allocate a collection of balls to a chosen set of urns (determined
by the unit entries of that row of $M$). Event $A(n,m)$ is the event that all the urns end
up with an even number of balls. Random allocations have been extensively studied; see
e.g. [16, p. 101], and the monographs [21, 24, 26, 30].

The following theorem, which we prove in Section 3, describes the exponential rate
of decay for $\mathbb{P}_{\rho_n}[A(n, m_n)]$ where $m_n/n$ has a finite positive limit. The theorem excludes
the case in which both $W$ and $m_n$ only take odd values; if $m$ is odd and $W_n$ is odd a.s.,
then $\mathbb{P}_{\rho_n}[A(n,m)] = 0$ since the total number of units in the matrix is odd.

Theorem 2.5. Suppose that $m_n/n \to \alpha \in (0, \infty)$ as $n \to \infty$, and that either (i) $m_n \in 2\mathbb{Z}$
for all $n$; or (ii) $\mathbb{P}[W \in 2\mathbb{Z}] > 0$. Then

$$\lim_{n \to \infty} n^{-1} \log \mathbb{P}_{\rho_n}[A(n, m_n)] = -R_\rho(\alpha), \tag{2.13}$$

where $R_\rho(\alpha) \ge 0$ is continuous and nondecreasing in $\alpha \ge 0$ and is defined by

$$R_\rho(\alpha) := -\log \sup_{\gamma \in [0,1/2]} \left( \frac{(\rho(1 - 2\gamma))^{\alpha}}{2 \gamma^{\gamma} (1 - \gamma)^{1 - \gamma}} \right). \tag{2.14}$$



The relevance of Theorem 2.5 to Theorem 2.2 is clear; see the formula (4.3) below. 



We present an interesting special case of Theorem 2.5 that can be understood in
isolation. Let $\pi_n(m)$ denote the probability that all the components $Y_j$ of a multinomial
$(m; n^{-1}, \ldots, n^{-1})$ random vector $(Y_1, \ldots, Y_n)$ are even. Here $Y_j$ can be interpreted as
the occupancy of urn $j$ after $m$ balls are independently and uniformly distributed into $n$
distinct urns: see e.g. Q p. 23], [30, p. 11], or [21, p. 90].

Then $\pi_n(m) = 2^{-n} \sum_{j=0}^{n} \binom{n}{j} (1 - (2j/n))^m$; this formula is known in the Ehrenfest
urn literature (see [30, pp. 128-129]) and can also be obtained from (3.1) below. If $m$ is
odd, $\pi_n(m)$ must be zero.

Proposition 2.6. Let $\pi_n(m_n)$ denote the probability that all the $n$ components of a
multinomial $(m_n; n^{-1}, \ldots, n^{-1})$ random vector are even. Suppose that $m_n$ is even for each $n$
and $m_n/n \to \alpha = \lambda \tanh \lambda \in (0, \infty)$ as $n \to \infty$. Then

$$\lim_{n \to \infty} n^{-1} \log \pi_n(m_n) = \log \cosh \lambda - (\lambda \tanh \lambda)(1 - \log \tanh \lambda). \tag{2.15}$$

This result follows from Theorem 2.5, as we shall show in Section 3.5. Proposition
2.6 can also be derived from a result of Kolchin [23, Theorem 2, p. 141].
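The closed formula for the even-occupancy probability, $\pi_n(m) = 2^{-n} \sum_{j=0}^{n} \binom{n}{j} (1 - 2j/n)^m$, is easy to check against direct enumeration for small $n$ and $m$. The sketch below does so in exact rational arithmetic; the function names are illustrative assumptions, and the brute-force routine is only feasible for tiny cases since it visits all $n^m$ allocations.

```python
from fractions import Fraction
from itertools import product
from math import comb


def pi_formula(n, m):
    """Even-occupancy probability from the closed formula, as an exact Fraction."""
    total = sum(comb(n, j) * Fraction(n - 2 * j, n) ** m for j in range(n + 1))
    return total / 2 ** n


def pi_brute(n, m):
    """Same probability by enumerating all n**m ways to place m balls in n urns."""
    even = 0
    for balls in product(range(n), repeat=m):
        counts = [0] * n
        for urn in balls:
            counts[urn] += 1
        even += all(c % 2 == 0 for c in counts)
    return Fraction(even, n ** m)
```

For larger $n$ one can compare `math.log(pi_formula(n, m)) / n` with the right-hand side of (2.15), choosing $\lambda$ so that $\lambda \tanh \lambda$ is close to $m/n$.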



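2.4 The fixed row-weight case

We now describe the special case of the results in Section 2.2 when $\mathbb{P}[W = r] = 1$, for
some fixed $r$. The existing literature is largely concerned with this case (see the discussion
in Section 2.6 below).

Let $r \in \mathbb{N}$. Define the thresholds $\alpha^*_r$, $\alpha^\sharp_r$, and $\underline{\alpha}_r$ to be the values of $\alpha^*_\rho$, $\alpha^\sharp_\rho$, and $\underline{\alpha}_\rho$,
respectively, in the case where $\mathbb{P}[W = r] = 1$ (i.e. where $\rho(s) = s^r$). By (2.1) we have

$$\alpha^*_r := \inf\{\alpha \ge 0 : F_r(\alpha) > 0\},$$

where

$$F_r(\alpha) := \log \sup_{\gamma \in [0,1/2]} \left( \frac{(1 + (1 - 2\gamma)^r)^{\alpha}}{2 \gamma^{\gamma} (1 - \gamma)^{1 - \gamma}} \right). \tag{2.16}$$

By Proposition 2.1, $F_r(\alpha) = 0$ for $\alpha \le \alpha^*_r$, but $F_r(\alpha) > 0$ for $\alpha > \alpha^*_r$. If $r \ge 3$, $\psi(x) = 0$
has a single solution in $(0, 1)$ (see Proposition 5.8 below) denoted $x^*_r$ and satisfying

$$x^*_r + \left( 1 - \frac{r - 1}{r} x^*_r \right) \log(1 - x^*_r) = 0, \tag{2.17}$$

and

$$\underline{\alpha}_r = h(x^*_r) = -\frac{\log(1 - x^*_r)}{r (x^*_r)^{r-1}}. \tag{2.18}$$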



For example, $\alpha^\sharp_3 \approx 0.818469$, $g^*(\alpha^\sharp_3) \approx 0.715332$, and $x^*_3 \approx 0.883414$ so $\underline{\alpha}_3 \approx 0.917935$
(see also Figure 4 below). The next result is a specialization of Theorems 2.2 and 2.3
together with Theorem 5.6 and Proposition 5.7.



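Theorem 2.7. Let $r \in \mathbb{N}$. Suppose that $W_n \to r$ in probability. Suppose that $m_n/n \to
\alpha \in (0, \infty)$ as $n \to \infty$. Then

$$\lim_{n \to \infty} n^{-1} \log \mathbb{E}_{\rho_n}[\mathcal{N}(n, m_n)] = F_r(\alpha); \tag{2.19}$$

in particular, the limit in (2.19) is strictly positive if and only if $\alpha > \alpha^*_r$. If also $\mathbb{P}[W_n \ge
3] = 1$ (so $r \ge 3$) and $W_n$ are uniformly bounded, then $\alpha^*_r \le \underline{\alpha}_r \le 1$ and for any $\varepsilon > 0$,

$$\lim_{n \to \infty} \mathbb{P}_{\rho_n}\left[ (\alpha^*_r - \varepsilon) n < T_n < (\underline{\alpha}_r + \varepsilon) n \right] = 1,$$

and moreover, there exists $\varepsilon_\alpha > 0$ such that for all $\varepsilon \in (0, \varepsilon_\alpha)$,

$$\lim_{n \to \infty} \mathbb{P}_{\rho_n}[E(n, m_n; \varepsilon)] = \begin{cases} 0 & \text{if } \alpha < \underline{\alpha}_r, \\ 1 & \text{if } \alpha > \underline{\alpha}_r. \end{cases} \tag{2.20}$$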



    r                        1    2    3         4         5         6         7         8
    $\alpha^\sharp_r$        0    0.5  0.818469  0.772280  0.701780  0.637081  0.581775  0.534997
    $\alpha^*_r$             0    0.5  0.889493  0.967147  0.989162  0.996228  0.998650  0.999510
    $\underline{\alpha}_r$   -    -    0.917935  0.976770  0.992438  0.997380  0.999064  0.999660

Table 1: Threshold parameters for $r$-uniform random hypergraphs. Note that $\underline{\alpha}_r$ is not
defined when $r = 1$ or 2.



The sharp monotone transition displayed by (2.20) in the fixed weight case was
indicated by Cooper [8]. In the general, random weight setting, the picture can be more
complicated, and there exist examples where the transition is non-monotone: see the
example in Figure 3 below.

Remark. In the case $r = 2$, compare Theorem 2.7 to Proposition 2.9(ii) below: the first
cycle in the random graph appears at $m_n = Zn$, where $Z$ has an asymptotic distribution
on $(0, 1/2)$. It is a classical result that for $\alpha \in (0, 1/2)$, if $m_n/n \to \alpha$, the number of
cycles $\mathcal{N}(n, m_n)$ in an Erdős-Rényi graph has a Poisson limit with finite expectation, but
the limiting expectation is infinite for $\alpha \ge 1/2$ (see e.g. [24, §2.3]); we could not find in
the literature an explicit reference to the fact that the expectation blows up exponentially
with $n$ for $\alpha > 1/2$, at the rate given by Theorem 2.7 (a classical result of Erdős and Rényi
states that at $\alpha = 1/2$, the expected number of cycles grows as $\tfrac{1}{4} \log n$: see Theorem 5a
of p) p. 41]).



2.5 Threshold numerics and asymptotics 

In this section we discuss numerical and asymptotic evaluation of the thresholds in our
results for the fixed row-weight case described in Section 2.4. Table 1 shows values of $\alpha^\sharp_r$,
$\alpha^*_r$, and $\underline{\alpha}_r$, for $r \le 8$. Previous computations of these thresholds are reviewed in Section
2.6. As suggested by the numerical results, it can be shown that, for $r$ large enough,
$\alpha^\sharp_r < \alpha^*_r < \underline{\alpha}_r < 1$; this is a consequence of the following result.

Proposition 2.8. As $r \to \infty$,

$$\alpha^\sharp_r \sim \frac{\log r}{r} \to 0; \qquad 1 - \alpha^*_r \sim \frac{e^{-r}}{\log 2}; \qquad 1 - \underline{\alpha}_r \sim e^{-r}. \tag{2.21}$$

The asymptotic result for $\alpha^*_r$ in (2.21) is due to Calkin [5]; we prove the other two
parts in Section 5.5 below.

One can obtain arbitrarily sharp upper and lower bounds for the solution $x^*_r \in (0, 1)$
of $\psi(x) = 0$ in the case $\rho(s) = s^r$, $r \ge 3$, as follows. In this case, by (2.17) we have that
$x^*_r = i_r(x^*_r)$, where we set

$$i_r(x) := 1 - \exp\left( -\frac{x}{1 - \frac{r-1}{r} x} \right).$$

For $\theta \in [0, 1)$, $x \mapsto \frac{x}{1 - \theta x}$ is strictly increasing for $x \in [0, 1]$. Thus if $x^*_r > a_n$, it follows
that $x^*_r > a_{n+1} := i_r(a_n)$. Also, $i_r'(0) = 1$, $i_r''(0) = \frac{r-2}{r} > 0$, and $i_r(1) = 1 - e^{-r} < 1$, so
$i_r(x) > x$ for $x \in (0, x^*_r)$ but $i_r(x) < x$ for $x \in (x^*_r, 1]$. Hence starting with $a_0 = \frac{r-2}{r-1} < x^*_r$
(an inequality proved in Proposition 5.8 below), we can iterate to obtain an increasing
sequence of lower bounds $a_n$ for $x^*_r$. Conversely, starting instead with $b_0 = 1 > x^*_r$
and iterating $b_{n+1} := i_r(b_n)$ gives a decreasing sequence of upper bounds $b_n$ for $x^*_r$. For
example, after one step we get

$$1 - \exp\left( -\frac{r(r-2)}{2(r-1)} \right) < x^*_r < 1 - e^{-r} \qquad (r \ge 3).$$

Proceeding up to $b_2$ for the upper bound and $a_4$ for the lower bound is sufficient to obtain
the $r \to \infty$ asymptotic expression

$$x^*_r = 1 - e^{-r} - r^2 e^{-2r} + O(r^4 e^{-3r}), \tag{2.22}$$

which will be the main ingredient in the proof of the $\underline{\alpha}_r$ result in (2.21).



In fact, this iterative procedure converges, so $a_n \uparrow x^*_r$ and $b_n \downarrow x^*_r$. To prove
convergence it is sufficient to show that $i_r'(x) < 1$ at $x = x^*_r$. A calculation shows that $i_r'(x)$
evaluated at $x = x^*_r$ comes to $x^{-2} (1 - x) (\log(1 - x))^2$, so for the required inequality it
suffices to show that

$$-x^{-1} \log(1 - x) < (1 - x)^{-1/2} \quad \text{for } 0 < x < 1.$$

The coefficient of $x^k$ in the power series expansion of the left-hand side of the last
inequality is $1/(k+1)$, and for the right-hand side it is $4^{-k} \binom{2k}{k}$, and both series are convergent
on the given interval. An induction shows that $1/(k+1) \le 4^{-k} \binom{2k}{k}$ for all integers $k \ge 0$.
So term-by-term comparison of the two power series gives the inequality.
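The two-sided iteration is easy to run numerically. The sketch below assumes the map $i_r(x) = 1 - \exp(-x/(1 - \frac{r-1}{r}x))$ and the starting points $a_0 = (r-2)/(r-1)$ and $b_0 = 1$; it returns the bracketing pair after a given number of steps and evaluates $-\log(1-x)/(r x^{r-1})$ at the midpoint, as in (2.18). Function names and the default step count are illustrative assumptions.

```python
import math


def i_r(x, r):
    """The map whose fixed point in (0, 1) is x*_r (assumed form, see lead-in)."""
    theta = (r - 1) / r
    return 1.0 - math.exp(-x / (1.0 - theta * x))


def bracket_x_star(r, steps):
    """Lower and upper bounds (a_n, b_n) for x*_r after `steps` iterations."""
    a = (r - 2) / (r - 1)  # assumed starting lower bound a_0
    b = 1.0                # starting upper bound b_0
    for _ in range(steps):
        a, b = i_r(a, r), i_r(b, r)
    return a, b


def lower_threshold(r, steps=200):
    """Approximate value of h(x*_r) = -log(1 - x*_r) / (r * x*_r**(r-1))."""
    a, b = bracket_x_star(r, steps)
    x = 0.5 * (a + b)
    return -math.log1p(-x) / (r * x ** (r - 1))
```

Because the derivative of $i_r$ at the fixed point is strictly less than 1, the gap $b_n - a_n$ shrinks geometrically, so a few hundred steps are far more than needed for double precision when $3 \le r \le 8$.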

To end this section we discuss the numerical evaluations of $\alpha^*_r$ in Table 1. In this
discussion we use some claimed properties of the functions involved that we do not verify
rigorously, since here we are only concerned with numerical estimation. Let

$$F_{r,\alpha}(\gamma) := \log \left( \frac{(1 + (1 - 2\gamma)^r)^{\alpha}}{2 \gamma^{\gamma} (1 - \gamma)^{1 - \gamma}} \right), \tag{2.23}$$

so that $F_r(\alpha) = \sup_{\gamma \in [0,1/2]} F_{r,\alpha}(\gamma)$. Differentiating, we obtain

$$\frac{d}{d\gamma} F_{r,\alpha}(\gamma) = -\frac{2 r \alpha (1 - 2\gamma)^{r-1}}{1 + (1 - 2\gamma)^r} + \log\left( \frac{1 - \gamma}{\gamma} \right). \tag{2.24}$$

Thus $\gamma$ is a stationary value for $F_{r,\alpha}$, if $\gamma$ solves

$$\alpha = \frac{1 + (1 - 2\gamma)^r}{2 r (1 - 2\gamma)^{r-1}} \log\left( \frac{1 - \gamma}{\gamma} \right) =: \alpha_r(\gamma). \tag{2.25}$$

Numerical curve sketching shows that (2.25) generically has at most 2 solutions in $(0, 1/2)$;
of such solutions, the smallest will be the local maximum, since $F'_{r,\alpha}(\gamma) \to \infty$ as $\gamma \downarrow 0$,
by (2.24). If (2.25) has no solutions in $(0, 1/2)$, then the supremum in (2.16) is either
$F_{r,\alpha}(0) = (\alpha - 1) \log 2$ or $F_{r,\alpha}(1/2) = 0$. Thus setting $\gamma_0 := \gamma_0(\alpha, r) = 0$ if (2.25) has
no solutions in $(0, 1/2)$ and $\gamma_0 := \gamma_0(\alpha, r)$ to be the smallest positive solution to (2.25)
otherwise, we have that $F_r(\alpha) = F_{r,\alpha}(\gamma_0)$ whenever $\alpha \in (0, 1)$.

For $\alpha \in (0, 1)$, $F_r(\alpha) > 0$ if and only if $\gamma_0(\alpha, r) > 0$. Moreover, Proposition 2.1 shows
that $\alpha^*_r < 1$, so that for $\alpha < 1$ such that $\gamma_0(\alpha, r) > 0$, $F_r(\alpha) = \alpha_r(\gamma_0) \log(1 + (1 - 2\gamma_0)^r) -
\log(2 \gamma_0^{\gamma_0} (1 - \gamma_0)^{1 - \gamma_0})$. Thus to find $\alpha^*_r$, we solve for $\gamma \in [0, 1/2]$ the equation

$$\alpha_r(\gamma) - \phi_r(\gamma) = 0, \tag{2.26}$$

where

$$\phi_r(\gamma) := \frac{\log\left( 2 \gamma^{\gamma} (1 - \gamma)^{1 - \gamma} \right)}{\log(1 + (1 - 2\gamma)^r)}.$$

Numerical curve plotting shows that $\gamma \mapsto \alpha_r(\gamma) - \phi_r(\gamma)$ is decreasing on $[0, 1/2]$, so (2.26)
can be solved using efficient numerical methods; let $\gamma_r$ denote the solution to (2.26). Then
we compute $\alpha^*_r$ via $\alpha^*_r = \alpha_r(\gamma_r)$.
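The numerical recipe just described can be sketched with a plain bisection. The code below assumes the forms $\alpha_r(\gamma) = \frac{1 + (1-2\gamma)^r}{2r(1-2\gamma)^{r-1}} \log\frac{1-\gamma}{\gamma}$ and $\phi_r(\gamma) = \log(2\gamma^\gamma(1-\gamma)^{1-\gamma}) / \log(1 + (1-2\gamma)^r)$, and relies on the claim that their difference is decreasing; the bracket endpoints, iteration count and function names are assumptions chosen for moderate $r$ (say $3 \le r \le 8$).

```python
import math


def alpha_r(gamma, r):
    """Value of alpha making gamma a stationary point, as in (2.25)."""
    u = 1.0 - 2.0 * gamma
    return (1.0 + u ** r) / (2.0 * r * u ** (r - 1)) * math.log((1.0 - gamma) / gamma)


def phi_r(gamma, r):
    """Ratio log(2 g^g (1-g)^(1-g)) / log(1 + (1-2g)^r) appearing in (2.26)."""
    u = 1.0 - 2.0 * gamma
    num = math.log(2.0) + gamma * math.log(gamma) + (1.0 - gamma) * math.log1p(-gamma)
    return num / math.log1p(u ** r)


def alpha_star(r, lo=1e-9, hi=0.499, iters=200):
    """Bisection for the root gamma_r of alpha_r - phi_r, then alpha_r(gamma_r).

    Assumes the difference is positive at `lo` and negative at `hi`.
    """
    def f(g):
        return alpha_r(g, r) - phi_r(g, r)

    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return alpha_r(0.5 * (lo + hi), r)
```

Bisection is used here only because it needs nothing beyond a sign change; any standard root finder would serve equally well.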



2.6 Discussion and related results 



2.6.1 Previous results on threshold values 

In the simplest case, $W_n = W_n^{\mathrm{hyp}} := r \wedge n$ a.s., for a fixed $r \in \mathbb{N}$; then $W = r$ a.s. This fixed
row weight 'hypergeometric' model is studied by Cooper [6]. A variation is the model
in which $r$ units are assigned to the row independently and uniformly at random, with
multiplicities reduced mod 2. The latter 'binomial' model corresponds to $W_n = W_n^{\mathrm{bin}}$
distributed as the number of odd components in a multinomial $(r; n^{-1}, \ldots, n^{-1})$ random
vector; then $W_n \xrightarrow{d} r$ (see Lemma 3.3 below). The $r \ge 3$ binomial model is studied by
Kolchin [23]. Note that in this model rows of all zeroes may appear, in which case they
are ignored (in other words, empty hyperedges are discounted): this is a small effect since
$\mathbb{P}[W_n^{\mathrm{bin}} = 0] = O(n^{-r/2})$, so a vanishing proportion of rows needs to be discarded.

Phase transitions in the null vector problem for random matrices over finite fields
with fixed row weight $r \ge 3$ have been studied since the early 1990s. In the case of the
binomial model, the threshold $\alpha^*_r$, $r \ge 3$, for $\mathbb{E}[\mathcal{N}(n, m_n)]$, $m_n/n \to \alpha$, was described by
Balakin et al. [3] and Kolchin [23] (having been announced in [22]); in these results $\alpha^*_r$ is
characterized by the fact that the expected number of non-trivial null vectors tends to
0 ($\infty$) when $\alpha < \alpha^*_r$ ($\alpha > \alpha^*_r$), but the proofs show that the growth is in fact exponential
for $\alpha > \alpha^*_r$. Calkin [5] and Cooper [6] also study $\alpha^*_r$, $r \ge 3$, and in particular Calkin
studies $\alpha^*_r$ as $r \to \infty$; both [6] and [5] work in the case $W_n = r \wedge n$. Note that Cooper's
expression of the matrix problem is transposed compared to ours. The first part of our
Theorem 2.7 represents a slight generalization of the results just mentioned above because
it allows for any class of sequences $W_n$ provided $W_n \to r$ in probability. The case of finite
fields of order $q \ge 3$ has also been studied: see for instance [4, 6, 25].



In these previous investigations, the analytic description of the threshold $\alpha_r^*$ varies. Calkin [5, §4] gives the same description of $\alpha_r^*$ as our equation (2.16). Systems of nonlinear equations for computing $\alpha_r^*$ have been proposed in [3, 6, 23]; these descriptions can be shown to be consistent with ours. Specifically, with $F_{r,\alpha}(\gamma)$ as given at (2.23), one may characterize $\alpha_r^*$ by the two equations $F_{r,\alpha}(\gamma) = 0$ and $\frac{d}{d\gamma} F_{r,\alpha}(\gamma) = 0$. On the substitution $\lambda = \frac{1}{2} \log\bigl(\frac{1-\gamma}{\gamma}\bigr)$, the first of these equations becomes, after some calculations along the lines of those in the proof of Proposition 2.6 below,

\[ \bigl(1 + (\tanh \lambda)^r\bigr)^{\alpha} e^{-\lambda \tanh \lambda} \cosh \lambda = 1. \tag{2.27} \]

The second equation, involving the vanishing of the derivative given at (2.24), gives, after the same substitution for $\lambda$,

\[ r\alpha = \bigl(1 + (\tanh \lambda)^{-r}\bigr) \lambda \tanh \lambda. \tag{2.28} \]

The system of equations (2.27) and (2.28) is the same as that given by Cooper [6, p. 269], and, after some manipulation, is seen to coincide also with that given by Balakin et al. [3, p. 564] and Kolchin [23, p. 139]. Despite this agreement, there are some small discrepancies in the numerical evaluations for $\alpha_r^*$ in [3, 5, 6, 23], which can presumably be put down to numerical inaccuracies.

A similar tabulation to our tabulation of $\underline{\alpha}_r$ is given by Cooper [8, pp. 370-371], who also gives an equivalent analytic description of $\underline{\alpha}_r$ to our (2.17); see also Dietzfelbinger et al. [13], which we discuss further in the next subsection. We note also that $\underline{\alpha}_r$ has received considerable attention in its own right: see e.g. [19] for its role in random XORSAT.



2.6.2 Between the two thresholds 

The following problem arises in the XORSAT literature. Let $r \in \mathbb{N}$ with $r \geq 3$. Let $M$ be our $m \times n$ matrix, with $m/n \to \alpha > 0$, and suppose $W_n = r$ a.s. for all $n \geq r$. Let $N$ denote the number of column vectors $x \in \{0,1\}^n$ such that $Mx = u$, where $u \in \{0,1\}^m$ is chosen uniformly at random (independent of $M$). Thus $N$ is a random variable.

Dubois and Mandler [14] show for $r = 3$, and Dietzfelbinger et al. [13] extend to general $r \in \mathbb{N}$ with $r \geq 3$ (also providing a more detailed proof), the following result (see [14, Theorem 3.1] and [13, Theorem 1], and also [19]): there is a constant $\underline{\alpha}_r > 0$ such that provided $\alpha < \underline{\alpha}_r$, $\mathbb{P}[N > 0] \to 1$ as $n \to \infty$.

The proof of this in [14] is based on a second moment calculation. The analytical definition of $\underline{\alpha}_r$ in [13, 14] is not obviously the same as our definition of $\underline{\alpha}_r$, but the definition in terms of cores (see [13, Proposition 3 and equation (4)]) seems to match our definition of $\underline{\alpha}_r$, and the numerical values in [13] are consistent with our $\underline{\alpha}_r$.

If we accept that the two constants coincide, this result implies that if $\alpha < \underline{\alpha}_r$ there is, for $n$ large enough, no non-zero left null vector for $M$, as follows. Suppose that a non-zero $y$ satisfies $y \cdot M = 0$. Then $N > 0$ implies $y \cdot u = 0$. So $\mathbb{P}[y \cdot u = 0] \geq \mathbb{P}[N > 0] \to 1$, which contradicts the easy observation that $\mathbb{P}[y \cdot u = 0] = 1/2$ for non-zero $y$.

We may then deduce that in the case with $W_n = n \wedge r$, our Theorem 2.3 may be strengthened to $n^{-1} T_n \to \underline{\alpha}_r$ in probability. This implies that for $\alpha$ in the interval $(\alpha_r^*, \underline{\alpha}_r)$ a form of substantialism occurs; existence of any left null vector is unlikely, but if there is one, there are lots of them.



2.7 Results for other random matrix models 
2.7.1 The case of fixed weight vectors with r = 1 or r = 2 

The classical cases of the constant weight model $W_n = n \wedge r$ in which $r \in \{1, 2\}$ exhibit different behaviour from the case $r \geq 3$. Recall the definition of $T_n$ from (2.5).

Proposition 2.9. (i) For $r = 1$, for any $z > 0$, $\lim_{n \to \infty} \mathbb{P}[T_n > z n^{1/2}] = \exp\{-z^2/2\}$.

(ii) For $r = 2$, for any $z \in (0, 1)$,

\[ \lim_{n \to \infty} \mathbb{P}[T_n > zn/2] = (1 - z)^{1/2} \exp\Bigl\{ \frac{z}{2} + \frac{z^2}{4} \Bigr\}. \tag{2.29} \]

Proof. In the case $r = 1$, $X_1, X_2, \ldots, X_m$ correspond to $m$ repetitions of the experiment of placing a ball uniformly at random in one of $n$ urns. Then $T_n$ is the first trial at which a ball is placed in an urn which is already occupied, and its law is given by the solution to the birthday problem [16, p. 33]:

\[ \log \mathbb{P}[T_n > m] = \sum_{j=1}^{m-1} \log\Bigl(1 - \frac{j}{n}\Bigr) = -\frac{m(m-1)}{2n} + O(m^3/n^2), \]

provided $m = o(n)$. Take $m = z n^{1/2}$ to obtain the result.

In the case $r = 2$, $T_n$ is the same as the number of edges which have to be added in the Erdős-Rényi random graph process in order for the first cycle to appear. The formula (2.29) has been derived by Janson [20, Theorem 8.1]. □



2.7.2 Uniform non-zero random vectors over a finite field 

For the sake of comparison with sparse matrix phenomena, the case where $X$ is selected uniformly at random from $\{0,1\}^n \setminus \{0\}$ is worthy of mention.

In fact we give a result for the more general finite field $GF[q]$ for arbitrary prime power $q$. Let $E_q^n := \{0, \ldots, q-1\}^n$. Suppose (for this section only) that $X_1, X_2, \ldots$ are i.i.d. random vectors that are uniformly distributed over $E_q^n \setminus \{0\}$, and let $M(n,m)$ be the $m \times n$ matrix over $GF[q]$ with rows $X_1, \ldots, X_m$. We write $\mathbb{P}_*$ for probability associated with this model. Define $T_n$ by (2.5) (with addition mod $q$). The following elementary result proves that $n + 1 - T_n$ is likely to be a small integer, uniformly in $n$. Write $\mathbb{Z}_+ := \{0, 1, 2, \ldots\}$.

Proposition 2.10. Suppose that $X_1, X_2, \ldots$ are independent and uniformly distributed on $E_q^n \setminus \{0\}$. As $n \to \infty$, $n + 1 - T_n$ converges to a distribution on $\mathbb{Z}_+$: for any $r \in \mathbb{Z}_+$,

\[ \mathbb{P}_*[T_n > n + 1 - r] \to \prod_{j=r}^{\infty} (1 - q^{-j}) =: p_r \in [0, 1], \tag{2.30} \]

where $p_0 = 0$ and $p_r \sim \exp\{-q^{1-r}/(q-1)\}$ as $r \to \infty$. Moreover, the following lower bounds apply: for $n \in \mathbb{N}$ and $1 \leq r \leq n$,

\[ \mathbb{P}_*[T_n > n + 1 - r] \geq \begin{cases} \exp\{-q^{1-r}\} & \text{if } q \geq 3, \\ \exp\{-2^{2-r}\} & \text{if } q = 2. \end{cases} \]

Remark. Limit results of the form of (2.30) are classical, and apparently date back at least to an 1895 paper of Landsberg (see [29, p. 69]). The $q = 2$ case of (2.30) corresponds to the $T = n - s$, $m + s = 0$ case of [24, Theorem 3.2.1, p. 126], but with a slightly different probabilistic model: there the $X_i$ are uniform on the whole of $E_2^n$, not just $E_2^n \setminus \{0\}$; comparing the results shows that this difference in the probability measures used is negligible in the limit.



Proof of Proposition 2.10. Let $A_k$ denote the event that $\{X_1, X_2, \ldots, X_k\}$ is linearly independent. If $A_k$ occurs, then the span of $X_1, X_2, \ldots, X_k$ is a subspace with $q^k - 1$ non-zero elements. Since $X_{k+1}$ is (statistically) independent of $X_1, X_2, \ldots, X_k$ and uniform on $E_q^n \setminus \{0\}$, which has $q^n - 1$ elements,

\[ \mathbb{P}_*[A_{k+1} \mid A_k] = \frac{q^n - q^k}{q^n - 1}, \]

so that, since $\mathbb{P}_*[A_1] = 1$, for $k \in \mathbb{N}$,

\[ \mathbb{P}_*[T_n > k] = \mathbb{P}_*[A_k] = \prod_{j=1}^{k-1} \mathbb{P}_*[A_{j+1} \mid A_j] = \prod_{j=1}^{k-1} \frac{q^n - q^j}{q^n - 1} = (1 - q^{-n})^{1-k} \prod_{j=1}^{k-1} (1 - q^{j-n}), \]

with the usual convention that an empty product is 1. Taking $k = n + 1 - r$ we obtain

\[ \mathbb{P}_*[T_n > n + 1 - r] = (1 - q^{-n})^{r-n} \prod_{j=r}^{n-1} (1 - q^{-j}) \to \prod_{j=r}^{\infty} (1 - q^{-j}) \tag{2.31} \]

on letting $n \to \infty$, establishing (2.30) with $p_r$ as stated there. The asymptotic form given for $p_r$ follows from writing $p_r = \exp \sum_{j=r}^{\infty} \log(1 - q^{-j})$ and applying Taylor's theorem. Moreover, it follows from the fact that $(1 - q^{-n})^{r-n} \geq 1$ provided $r \leq n$ that the convergence in (2.31) is in fact monotone for $n \geq r$, i.e.,

\[ \log \mathbb{P}_*[T_n > n + 1 - r] \geq \sum_{j=r}^{n-1} \log(1 - q^{-j}) \geq \sum_{j=r}^{\infty} \log(1 - q^{-j}). \]

With $f(x) = \log(1 - x) + x + x^2$, we have $f(0) = 0$ and $f'(x) = x(1-2x)/(1-x)$, which has the same sign as $x$ for $|x| \leq 1/2$, so $\log(1 - x) \geq -x - x^2$ for all $x$ with $|x| \leq 1/2$. Thus for $q \geq 2$ and $r \geq 1$,

\[ \log \mathbb{P}_*[T_n > n + 1 - r] \geq -\sum_{j=r}^{\infty} q^{-j} - \sum_{j=r}^{\infty} q^{-2j} = -\frac{q^{1-r}}{q-1} \Bigl( 1 + \frac{q^{1-r}}{q+1} \Bigr). \]

The stated lower bounds follow. □
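The product formula in the proof is easy to evaluate exactly for small parameters. The following Python sketch (function names are ours, chosen for illustration) computes $\mathbb{P}_*[T_n > k]$ with rational arithmetic and a truncated version of the limit $p_r$, so that the monotone convergence in (2.31) and the $q \geq 3$ lower bound can be inspected numerically.

```python
from fractions import Fraction
import math

def prob_independent(n, k, q):
    # Exact P_*[T_n > k]: the first k uniform non-zero vectors are independent.
    p = Fraction(1)
    for j in range(1, k):
        p *= Fraction(q**n - q**j, q**n - 1)
    return p

def p_limit(r, q, terms=200):
    # Truncated product approximating p_r = prod_{j >= r} (1 - q^{-j}).
    out = 1.0
    for j in range(r, r + terms):
        out *= 1.0 - float(q) ** (-j)
    return out

# Finite-n probabilities sit above the limit p_r, as the proof asserts.
n, q, r = 10, 3, 1
finite_n = float(prob_independent(n, n + 1 - r, q))
limit = p_limit(r, q)
lower_bound = math.exp(-q ** (1 - r))
```

The truncation at 200 factors is an arbitrary choice; the neglected factors differ from 1 by far less than floating-point precision.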
2.7.3 Random vectors of weight O(logn) 

There is a separate class of results, initiated by the classical work of Kovalenko [27] and Balakin [2], in which the weight is random with a law which changes as $n$ increases; in the classical case it is $\mathrm{Bin}(n, (a + \log n)/n)$ for $a \in \mathbb{R}$. Much more on this model is given in [7, 24, 28, 29], for example.

3 Multinomial parities and random allocations 

3.1 Overview and terminology 



In this section we work towards proving Theorem 2.5 and Proposition 2.6. Our null vector problem can be naturally formulated in terms of classical occupancy problems of random allocations of balls into urns.

We shall use the following terminology. Suppose $W$ is a random variable taking values in $\mathbb{Z}_+$, and $k \in \mathbb{N}$, and $p, p_1, p_2, \ldots, p_k$ are numbers in $[0, 1]$ such that $\sum_{i=1}^{k} p_i = 1$. (In most of the rest of the paper we assume $W \geq 1$, but for this section we can allow $W$ to take value 0.) Let us say the random variable $X$ has the $\mathrm{Bin}(W, p)$ distribution if for each $n \in \mathbb{Z}_+$ the conditional distribution of $X$, given that $W = n$, is binomial with parameters $(n, p)$. Let us say that a random vector $(Z_1, \ldots, Z_k)$ has the multinomial $(W; p_1, \ldots, p_k)$ distribution if for each $n \in \mathbb{Z}_+$ the conditional distribution of $(Z_1, \ldots, Z_k)$, given that $W = n$, is multinomial with parameters $(n; p_1, \ldots, p_k)$.

Recall from Section 2.1 that we assume $W_n$ (having the distribution of row weights for our matrix with $n$ columns) is chosen to converge in distribution to a limiting random variable $W$. An important special case is the so-called binomial model. In the binomial scheme take $W_n = W_n^{\mathrm{bin}}$ to be distributed as the number of odd components in a multinomial $(W; n^{-1}, \ldots, n^{-1})$ random vector. By Lemma 3.3 below, $W_n^{\mathrm{bin}} \to W$ in distribution as $n \to \infty$, so this is indeed a special case.

We write $\mathbb{P}^{\mathrm{bin}}_{\rho}$ for probability associated with the binomial allocation scheme. For the general model we write $\mathbb{P}_{\rho_n}$ as before.

3.2 Exact formulae for the allocation problem 

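In this subsection $n$ is fixed. Let $X_{ij}$ denote the $j$th component of $X_i$. Define the column sums $Y_j$ and partial row sums $S_{i,J}$ of the matrix $(X_{ij})$ as follows (in this case the addition does not need to be mod 2):

\[ Y_j := \sum_{i=1}^{m} X_{ij}, \quad j \in [n]; \qquad \text{and} \qquad S_{i,J} := \sum_{j \in J} X_{ij}, \quad J \subseteq [n]. \]

Recall from (2.12) that $A(n,m)$ denotes the event that $\mathbf{1}$ is a null (row) vector for $M$.

Lemma 3.1. In the binomial allocation scheme, we have the exact formula

\[ \mathbb{P}^{\mathrm{bin}}_{\rho}[A(n,m)] = 2^{-n} \sum_{j=0}^{n} \binom{n}{j} \bigl( \rho(1 - (2j/n)) \bigr)^m. \tag{3.1} \]

In the general allocation scheme,

\[ \mathbb{P}_{\rho_n}[A(n,m)] = 2^{-n} \sum_{J \subseteq [n]} \bigl( \mathbb{E}_{\rho_n}[(-1)^{S_{1,J}}] \bigr)^m \tag{3.2} \]
\[ = 2^{-n} \sum_{j=0}^{n} \binom{n}{j} (2p_j - 1)^m, \tag{3.3} \]

where $p_j := \sum_{r=0}^{n} p_{j,r} \mathbb{P}[W_n = r]$ and $p_{j,r}$ is given by

\[ p_{j,r} = \binom{n}{j}^{-1} \left[ \binom{n-r}{j} + \binom{r}{2}\binom{n-r}{j-2} + \binom{r}{4}\binom{n-r}{j-4} + \cdots \right]. \tag{3.4} \]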



Proof. Event $A(n,m)$ occurs if and only if all the $Y_j$ are even, so

\[ \mathbb{P}_{\rho_n}[A(n,m)] = \mathbb{E} \prod_{j=1}^{n} \Bigl( \frac{1 + (-1)^{Y_j}}{2} \Bigr) = 2^{-n} \sum_{J \subseteq [n]} \mathbb{E}\bigl[ (-1)^{\sum_{j \in J} Y_j} \bigr], \]

where the latter sum is over subsets $J$ of $[n]$, including the empty set. Since $\sum_{j \in J} Y_j = \sum_{i=1}^{m} S_{i,J}$ and $S_{1,J}, S_{2,J}, \ldots$ are i.i.d., (3.2) follows.

Consider the binomial allocation scheme. In the binomial model,

\[ (-1)^{S_{1,J}} = (-1)^{\sum_{j \in J} Z_j}, \]

where $(Z_1, \ldots, Z_n)$ has a multinomial $(W; n^{-1}, \ldots, n^{-1})$ distribution, so that $\sum_{j \in J} Z_j$ has a $\mathrm{Bin}(W, |J|/n)$ distribution. Recalling that if $\xi \sim \mathrm{Bin}(n,p)$ then $\mathbb{E}[s^{\xi}] = (sp + (1-p))^n$, we then obtain (3.1) from (3.2).

In the general scheme, conditional on $\sum_{j=1}^{n} X_{1j} = r$, the distribution of $S_{1,J}$ is hypergeometric with parameters $(n; |J|, r)$. We do not use the generating function (see e.g. [32, p. 17]) explicitly, but apply the hypergeometric probability mass function directly to (3.2).

Let $H \subseteq [n]$ denote the set of values of $j$ for which $X_{1j} = 1$. For $r \in [n]$, we write $\mathbb{E}_r$ for expectation in the case where $\mathbb{P}[W_n = r] = 1$. Instead of fixing $J \subseteq [n]$ and choosing $H$ as a uniform random $r$-subset, we obtain an exact formula for $\mathbb{E}_r[(-1)^{S_{1,J}}]$ by fixing $j = |J|$ and an $r$-subset $H$, and selecting $J$ uniformly from the $j$-subsets of $[n]$. The probability $p_{j,r}$ that $S_{1,J} := |H \cap J|$ is even is given by summing probabilities for $|H \cap J| \in \{0, 2, 4, \ldots\}$, giving the expression in (3.4). It follows that $\mathbb{E}_r[(-1)^{S_{1,J}}] = 2p_{j,r} - 1$, and hence

\[ \mathbb{E}_{\rho_n}[(-1)^{S_{1,J}}] = \sum_{r} (2p_{j,r} - 1) \mathbb{P}[W_n = r] = 2p_j - 1. \]

Substitution of this into (3.2) gives (3.3). □
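As a sanity check on (3.3) and (3.4), one can compare the formula with a brute-force enumeration for tiny matrices. The Python sketch below assumes the fixed-weight case $\mathbb{P}[W_n = r] = 1$, so that $p_j = p_{j,r}$; the function names are ours and the enumeration is only feasible for very small $n$ and $m$.

```python
from fractions import Fraction
from itertools import combinations, product
from math import comb

def p_even(n, j, r):
    # p_{j,r} from (3.4): probability that |H ∩ J| is even.
    total = sum(comb(r, k) * comb(n - r, j - k) for k in range(0, min(r, j) + 1, 2))
    return Fraction(total, comb(n, j))

def prob_null_formula(n, m, r):
    # Formula (3.3) in the fixed-weight case P[W_n = r] = 1.
    s = sum(comb(n, j) * (2 * p_even(n, j, r) - 1) ** m for j in range(n + 1))
    return s / 2 ** n

def prob_null_brute(n, m, r):
    # Enumerate all m-tuples of weight-r rows; count those summing to 0 mod 2.
    rows = list(combinations(range(n), r))
    hits = 0
    for choice in product(rows, repeat=m):
        parity = [0] * n
        for row in choice:
            for c in row:
                parity[c] ^= 1
        hits += not any(parity)
    return Fraction(hits, len(rows) ** m)
```

For instance, with $n = 4$, $r = 2$ and $m = 3$ the three rows sum to zero exactly when they form a triangle in $K_4$, which happens for $24$ of the $216$ ordered triples, and both functions return $1/9$.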



3.3 Asymptotics in the binomial model 

The remaining parts of Section 3 are concerned with asymptotic analysis of the quantities in Lemma 3.1. First we state a result that will enable us to work primarily with even $m$, which has technical advantages.

Lemma 3.2. Suppose that $W_n \to W$ in distribution and $\mathbb{P}[W \in 2\mathbb{Z}] > 0$. Then for any $m \geq 3$,

\[ \log \mathbb{P}_{\rho_n}[A(n, m-3)] + O(\log n) \leq \log \mathbb{P}_{\rho_n}[A(n,m)] \leq \log \mathbb{P}_{\rho_n}[A(n, m+3)] + O(\log n). \]

Proof. The fact that $\mathbb{P}[W \in 2\mathbb{Z}] > 0$ and $W_n \to W$ in distribution implies that there exist $\varepsilon > 0$ and $r \in 2\mathbb{Z}$ such that $\mathbb{P}[W_n = r] \geq \varepsilon$ for all $n$ large enough. For any $m \geq 3$, suppose that $A(n, m-3)$ occurs. Then $A(n,m)$ will occur if the 3 additional rows themselves constitute a hypercycle. With probability at least $\varepsilon^3$, these new rows each have $r$ units, and given this, there is a probability at least $n^{-2r}$, say, that these units form a hypercycle. In other words, $\log \mathbb{P}_{\rho_n}[A(n,m)] \geq \log \mathbb{P}_{\rho_n}[A(n, m-3)] + O(\log n)$. Applying this inequality twice, once with $m + 3$ in place of $m$, gives the result. □

Recall that in general we assume $W_n \to W$ in distribution. Next we give an elementary lemma that confirms the binomial model's place in this framework.

Lemma 3.3. For $W$ a $\mathbb{Z}_+$-valued random variable, let $W_n^{\mathrm{bin}}$ be the number of odd components in a multinomial $(W; n^{-1}, \ldots, n^{-1})$ random vector. Then $W_n^{\mathrm{bin}} \to W$ in distribution as $n \to \infty$.

Proof. Let $W$ and $W_n^{\mathrm{bin}}$ be coupled in the natural way. Then for each $k \in \mathbb{N}$, by the union bound over pairs of units landing in the same component,

\[ \mathbb{P}[W_n^{\mathrm{bin}} \neq W \mid W = k] \leq \binom{k}{2} n^{-1}, \]

which tends to zero as $n \to \infty$, and the result follows easily from this. □



We will prove Theorem 2.5 (in Section 3.5) by first showing that (2.13) holds in the binomial setting, using (3.1) and the Stirling approximation for the binomial coefficients as discussed in Section 6. Then we will extend this to the general setting using an approximation argument described in Section 3.4. We start by proving a slightly more general statement than (2.13) in the binomial case, which we will also need later in the proof of Theorem 2.2.
Lemma 3.4. Recall the definition of $R_\rho$ from (2.14). Then $R_\rho(\alpha)$ is continuous and nondecreasing as a function of $\alpha$. Suppose that either (i) $m_n \in 2\mathbb{Z}$ for all $n$; or (ii) $\mathbb{P}[W \in 2\mathbb{Z}] > 0$. Suppose that there exist $\alpha_1, \alpha_2$ with $0 < \alpha_1 < \alpha_2 < \infty$ such that, for all $n$ sufficiently large, $\alpha_1 \leq m_n/n \leq \alpha_2$. Then,

\[ \limsup_{n \to \infty} n^{-1} \log \mathbb{P}^{\mathrm{bin}}_{\rho}[A(n, m_n)] \leq -R_\rho(\alpha_1); \qquad \liminf_{n \to \infty} n^{-1} \log \mathbb{P}^{\mathrm{bin}}_{\rho}[A(n, m_n)] \geq -R_\rho(\alpha_2). \tag{3.5} \]

In particular, if $m_n/n \to \alpha > 0$, then

\[ \lim_{n \to \infty} n^{-1} \log \mathbb{P}^{\mathrm{bin}}_{\rho}[A(n, m_n)] = -R_\rho(\alpha). \tag{3.6} \]

Proof. Suppose that $m_n/n \in [\alpha_1, \alpha_2]$. First assume that the $m_n$ are even. By (3.1) and Lemma 6.3(iii),

\[ \mathbb{P}^{\mathrm{bin}}_{\rho}[A(n, m_n)] \leq (n+1) 2^{-n} \max_{0 \leq j \leq n} \binom{n}{j} \bigl( \rho(1 - (2j/n)) \bigr)^{m_n} \leq (n+1) 2^{-n} \sup_{\gamma \in [0,1/2]} \binom{n}{\gamma n} \bigl( \rho(1 - 2\gamma) \bigr)^{m_n}, \]

where we set $\binom{n}{x} = 0$ if $x$ is not an integer in $\{0, 1, \ldots, n\}$. Using the upper bound on binomial coefficients from the first inequality in (6.3), we get

\[ \mathbb{P}^{\mathrm{bin}}_{\rho}[A(n, m_n)] \leq (n+1) \sup_{\gamma \in [0,1/2]} \bigl( 2\gamma^{\gamma}(1-\gamma)^{1-\gamma} \bigr)^{-n} \bigl( \rho(1 - 2\gamma) \bigr)^{m_n}. \]

By monotonicity, we then obtain

\[ n^{-1} \log \mathbb{P}^{\mathrm{bin}}_{\rho}[A(n, m_n)] \leq n^{-1} \log(n+1) + \log \sup_{\gamma \in [0,1/2]} g_{m_n/n}(\gamma), \tag{3.7} \]

where we have set

\[ g_\alpha(\gamma) := \frac{(\rho(1 - 2\gamma))^{\alpha}}{2\gamma^{\gamma}(1-\gamma)^{1-\gamma}}, \]

so with $R_\rho$ as defined at (2.14), $R_\rho(\alpha) = -\log \sup_{\gamma \in [0,1/2]} g_\alpha(\gamma)$. Note that $g_\alpha(\gamma)$ is continuous in $\gamma$ for $\alpha > 0$ and $\gamma \in [0, 1/2]$, and $g_\alpha(\gamma)$ is nonincreasing in $\alpha$; this monotonicity implies, by Dini's theorem, that if $\alpha' \to \alpha$ monotonically then $g_{\alpha'}$ converges uniformly to $g_\alpha$ on the compact interval $[0, 1/2]$. It follows that $\alpha \mapsto \sup_{\gamma \in [0,1/2]} g_\alpha(\gamma)$ is continuous as a function of $\alpha > 0$, and is also nonincreasing in $\alpha$. In particular, this shows that $R_\rho(\alpha)$ is continuous and nondecreasing in $\alpha$, as claimed in the lemma. Moreover, we obtain from (3.7) and the fact that $m_n/n \geq \alpha_1$ that

\[ \limsup_{n \to \infty} n^{-1} \log \mathbb{P}^{\mathrm{bin}}_{\rho}[A(n, m_n)] \leq \log \sup_{\gamma \in [0,1/2]} g_{\alpha_1}(\gamma), \]

which gives the first inequality in (3.5).

For the second inequality, we have from (3.1) and (6.4) that for any integer $i_n \leq n/2$,

\[ \mathbb{P}^{\mathrm{bin}}_{\rho}[A(n, m_n)] \geq 2^{-n} \binom{n}{i_n} \bigl( \rho(1 - (2 i_n/n)) \bigr)^{m_n} \geq e^{-1/6} \Bigl( \frac{n}{2\pi i_n (n - i_n)} \Bigr)^{1/2} \bigl[ g_{m_n/n}(i_n/n) \bigr]^n, \]

using the fact that $m_n$ is even. Then, since $m_n/n \leq \alpha_2$,

\[ \mathbb{P}^{\mathrm{bin}}_{\rho}[A(n, m_n)] \geq e^{-1/6} \Bigl( \frac{n}{2\pi i_n (n - i_n)} \Bigr)^{1/2} \bigl( g_{\alpha_2}(i_n/n) \bigr)^n. \]

Now use continuity of $g_\alpha$ to choose a sequence of integers $i_n \leq n/2$, $n \in \mathbb{N}$, such that $g_{\alpha_2}(i_n/n) \to \sup_{\gamma \in [0,1/2]} g_{\alpha_2}(\gamma)$, with $i_n \to \infty$ and $n - i_n \to \infty$ as $n \to \infty$. The lower bound in (3.5) follows, for $m_n$ even.

The results in (3.5) extend to the case of odd $m_n$ with $\mathbb{P}[W \in 2\mathbb{Z}] > 0$ by Lemma 3.2, which is applicable here by Lemma 3.3. Finally, (3.6) follows from (3.5) on taking $\alpha_1 = \alpha - \varepsilon$ and $\alpha_2 = \alpha + \varepsilon$, for arbitrary $\varepsilon > 0$, and using the continuity of $R_\rho$. □



3.4 Approximation by the binomial model 

The exact formula (3.1) is simpler to work with than the more complicated exact formula (3.3), but intuition suggests that the asymptotics of any of the models in the class with $W_n \to W$ in distribution should be similar. In this section we quantify this intuition.

Lemma 3.5. Suppose that $W_n \to W$ in distribution and either (i) $m_n \in 2\mathbb{Z}$ for all $n$; or (ii) $\mathbb{P}[W \in 2\mathbb{Z}] > 0$. Suppose that there exist $\alpha_1, \alpha_2$ with $0 < \alpha_1 < \alpha_2 < \infty$ and $n_0 \in \mathbb{N}$ such that $\alpha_1 \leq m_n/n \leq \alpha_2$ for all $n \geq n_0$. Then, uniformly over sequences $m_n$ satisfying the given conditions,

\[ \lim_{n \to \infty} n^{-1} \bigl| \log \mathbb{P}_{\rho_n}[A(n, m_n)] - \log \mathbb{P}^{\mathrm{bin}}_{\rho}[A(n, m_n)] \bigr| = 0. \]

In particular, if $m_n/n \to \alpha > 0$,

\[ \lim_{n \to \infty} n^{-1} \log \mathbb{P}_{\rho_n}[A(n, m_n)] = \lim_{n \to \infty} n^{-1} \log \mathbb{P}^{\mathrm{bin}}_{\rho}[A(n, m_n)]. \tag{3.8} \]



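Proof. Denote the weight of row 1 by $W_n$ in the general allocation scheme, and by $W_n^{\mathrm{bin}}$ in the binomial scheme. Then $W_n \to W$ and $W_n^{\mathrm{bin}} \to W$ in distribution, and we can work in a probability space where $\mathbb{P}[W_n \neq W_n^{\mathrm{bin}}] \to 0$. As in Section 3.2, set $S_{1,J} := \sum_{j \in J} X_{1j}$. Conditional on $\{W_n = W_n^{\mathrm{bin}}\}$, the (conditional) law of $S_{1,J}$ is the same as in the binomial model. Thus

\[ \sup_{J \subseteq [n]} \bigl| \mathbb{E}_{\rho_n}[(-1)^{S_{1,J}}] - \mathbb{E}^{\mathrm{bin}}_{\rho}[(-1)^{S_{1,J}}] \bigr| \leq 2 \mathbb{P}[W_n \neq W_n^{\mathrm{bin}}] \to 0. \tag{3.9} \]

First suppose that the $m_n$ are even. By (3.2) with (3.1) and (3.9), there exists a triangular array of numbers $(\delta_{j,n}, j \in [n] \cup \{0\}, n \in \mathbb{N})$ satisfying $\max_{0 \leq j \leq n} |\delta_{j,n}| \to 0$ as $n \to \infty$, and

\[ \mathbb{P}_{\rho_n}[A(n, m_n)] = 2^{-n} \sum_{j=0}^{n} \binom{n}{j} \bigl( \rho(1 - (2j/n)) + \delta_{j,n} \bigr)^{m_n}. \tag{3.10} \]

Let $\varepsilon > 0$ and choose $K > 1$ large enough so that $\log(1 - K^{-1}) > -\varepsilon/\alpha_2$ and $\log(1 + K^{-1}) < \varepsilon/\alpha_2$. Then choose $\delta > 0$ such that $(K+1)\delta < \exp\{-1/(\alpha_1 \varepsilon)\}$. Finally assume $n$ is large enough so that $\sup_{j \in [n] \cup \{0\}} |\delta_{j,n}| \leq \delta$ and $\alpha_1 \leq m_n/n \leq \alpha_2$.

We split the sum in (3.10) into two parts, depending on the size of $\rho(1 - (2j/n))$. First suppose that $|\rho(1 - (2j/n))| \leq K\delta$. In this case

\[ \bigl| (\rho(1 - (2j/n)) + \delta_{j,n})^{m_n} \bigr| \leq ((K+1)\delta)^{m_n} \leq \exp\{-m_n/(\alpha_1 \varepsilon)\} \leq \exp\{-n/\varepsilon\}, \tag{3.11} \]

and similarly,

\[ (\rho(1 - (2j/n)))^{m_n} \leq \exp\{-n/\varepsilon\}. \tag{3.12} \]

It follows from (3.10) and (3.11) that

\[ \mathbb{P}_{\rho_n}[A(n, m_n)] = 2^{-n} \sum_{j : |\rho(1 - (2j/n))| > K\delta} \binom{n}{j} \bigl( \rho(1 - (2j/n)) + \delta_{j,n} \bigr)^{m_n} + O(\exp\{-n/\varepsilon\}). \tag{3.13} \]

Now suppose that $|\rho(1 - (2j/n))| > K\delta$. In this case

\[ (\rho(1 - (2j/n)) + \delta_{j,n})^{m_n} = (\rho(1 - (2j/n)))^{m_n} (1 + \theta_{j,n} K^{-1})^{m_n}, \]

where $|\theta_{j,n}| \leq 1$. By the choice of $K$, $e^{-\varepsilon/\alpha_2} \leq 1 + \theta_{j,n} K^{-1} \leq e^{\varepsilon/\alpha_2}$, and hence

\[ \exp\{-\varepsilon n\} \leq (1 + \theta_{j,n} K^{-1})^{m_n} \leq \exp\{\varepsilon n\}. \]

Therefore

\[ (\rho(1 - (2j/n)) + \delta_{j,n})^{m_n} = (\rho(1 - (2j/n)))^{m_n} \exp\{\varepsilon_{j,n} n\}, \]

where $|\varepsilon_{j,n}| \leq \varepsilon$. Hence for the sum on the right-hand side of (3.13), there exists $\varepsilon_n$ with $|\varepsilon_n| \leq \varepsilon$ such that

\[ 2^{-n} \sum_{j : |\rho(1 - (2j/n))| > K\delta} \binom{n}{j} \bigl( \rho(1 - (2j/n)) + \delta_{j,n} \bigr)^{m_n} = 2^{-n} \exp\{\varepsilon_n n\} \sum_{j : |\rho(1 - (2j/n))| > K\delta} \binom{n}{j} (\rho(1 - (2j/n)))^{m_n}, \]

using the assumption that $m_n$ is even so all the terms in the sum are nonnegative. Then by (3.12) and a similar argument to (3.13), the last displayed quantity is equal to $\mathbb{P}^{\mathrm{bin}}_{\rho}[A(n, m_n)] \exp\{\varepsilon_n n\} + O(\exp\{(\varepsilon - \varepsilon^{-1}) n\})$. Combining this with (3.13) we obtain

\[ \mathbb{P}_{\rho_n}[A(n, m_n)] = \mathbb{P}^{\mathrm{bin}}_{\rho}[A(n, m_n)] \exp\{\varepsilon_n n\} + O(\exp\{(\varepsilon - \varepsilon^{-1}) n\}), \tag{3.14} \]

uniformly in $n$ (and $m_n$), the implicit constants depending on $\alpha_1$ and $\alpha_2$. It follows from (3.14) that

\[ \log \mathbb{P}_{\rho_n}[A(n, m_n)] = \log \mathbb{P}^{\mathrm{bin}}_{\rho}[A(n, m_n)] + \varepsilon_n n + \log \Bigl( 1 + \frac{\Delta_n}{\mathbb{P}^{\mathrm{bin}}_{\rho}[A(n, m_n)] \exp\{\varepsilon_n n\}} \Bigr), \]

where $\Delta_n = O(\exp\{(\varepsilon - \varepsilon^{-1}) n\})$ is the final term in (3.14). By Lemma 3.4, we have that $\mathbb{P}^{\mathrm{bin}}_{\rho}[A(n, m_n)] \geq \exp\{-n R_\rho(\alpha_2) - \varepsilon n\}$, for all $n$ large enough. So we may take $\varepsilon > 0$ small enough so that

\[ \log \Bigl( 1 + \frac{\Delta_n}{\mathbb{P}^{\mathrm{bin}}_{\rho}[A(n, m_n)] \exp\{\varepsilon_n n\}} \Bigr) = O(\exp\{-n\}), \]

say. Hence

\[ n^{-1} \log \mathbb{P}_{\rho_n}[A(n, m_n)] = n^{-1} \log \mathbb{P}^{\mathrm{bin}}_{\rho}[A(n, m_n)] + \varepsilon_n + o(1). \]

Since $|\varepsilon_n| \leq \varepsilon$ and $\varepsilon > 0$ was arbitrary, (3.8) follows in the case of even $m_n$. In the other case, Lemma 3.2 yields the same conclusion. The final statement in the lemma then follows from Lemma 3.4, which says that $\lim_{n \to \infty} n^{-1} \log \mathbb{P}^{\mathrm{bin}}_{\rho}[A(n, m_n)]$ exists and is finite when $m_n/n \to \alpha > 0$. □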
3.5 Proofs of Theorem 2.5 and Proposition 2.6

Now we can complete the proofs of Theorem 2.5 and Proposition 2.6.

Proof of Theorem 2.5. The theorem is now a consequence of Lemmas 3.4 and 3.5. □

Proof of Proposition 2.6. Set

\[ f_\alpha(\gamma) := \log \Bigl( \frac{(1 - 2\gamma)^{\alpha}}{2\gamma^{\gamma}(1-\gamma)^{1-\gamma}} \Bigr). \tag{3.15} \]

In the case $\rho(s) = s$, it follows from Theorem 2.5 that $n^{-1} \log \pi_n(m_n) \to \sup_{\gamma \in [0,1/2]} f_\alpha(\gamma)$. Proposition 2.6 will follow once we prove that, setting $\alpha = \lambda \tanh \lambda$,

\[ \sup_{\gamma \in [0,1/2]} f_\alpha(\gamma) = -(\lambda \tanh \lambda)(1 - \log(\tanh \lambda)) + \log(\cosh \lambda). \tag{3.16} \]

Note that $f_\alpha(\gamma) \to -\infty$ as $\gamma \uparrow 1/2$. Differentiating (3.15) gives, for $\gamma \in (0, 1/2)$,

\[ \frac{d}{d\gamma} f_\alpha(\gamma) = -\frac{2\alpha}{1 - 2\gamma} + \log \Bigl( \frac{1-\gamma}{\gamma} \Bigr), \]

which is zero at $\gamma_1 := \gamma_1(\alpha) \in (0, 1/2)$ defined implicitly in terms of $\alpha$ by

\[ \alpha = \frac{1}{2} (1 - 2\gamma_1) \log \Bigl( \frac{1 - \gamma_1}{\gamma_1} \Bigr). \tag{3.17} \]

One can verify that for $\alpha > 0$ (3.17) defines a unique stationary value $\gamma_1 \in (0, 1/2)$ which is a local maximum, since for $\gamma_1 \in (0, 1/2)$ the right-hand side of (3.17) is positive, continuous, and strictly decreasing as a function of $\gamma_1$, vanishing at $\gamma_1 = 1/2$; this local maximum is indeed the maximum of $f_\alpha(\gamma)$ for $\gamma \in [0, 1/2]$ since $f'_\alpha(\gamma) \to \infty$ as $\gamma \downarrow 0$ (and also $f''_\alpha(\gamma_1) < 0$).

Setting $\lambda = \frac{1}{2} \log \bigl( \frac{1 - \gamma_1}{\gamma_1} \bigr)$ we see that $\lambda \tanh \lambda = \alpha$ as given by (3.17), since we get $\tanh \lambda = 1 - 2\gamma_1$. To verify (3.16) we need to express $f_\alpha(\gamma_1)$ in terms of $\lambda$ to get the expression on the right-hand side of (3.16). We have

\[ f_\alpha(\gamma_1) = \alpha \log(1 - 2\gamma_1) - \log 2 - \gamma_1 \log \gamma_1 - (1 - \gamma_1) \log(1 - \gamma_1) \]
\[ = (\lambda \tanh \lambda) \log \tanh \lambda + \log \cosh \lambda + ((1/2) - \gamma_1) \log \gamma_1 + (\gamma_1 - (1/2)) \log(1 - \gamma_1), \]

where we have used the fact that $\log \tanh \lambda = \log(1 - 2\gamma_1)$ and $\log \cosh \lambda = -\frac{1}{2} \log(1 - \tanh^2 \lambda) = -\log 2 - \frac{1}{2} \log \gamma_1 - \frac{1}{2} \log(1 - \gamma_1)$. Collecting the terms involving $\gamma_1$ in the last displayed equation, we see that they simplify to $-\lambda \tanh \lambda = -\alpha$ as given by (3.17), so we verify (3.16). □
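The identity (3.16) lends itself to a quick numerical check. The Python sketch below evaluates $f_\alpha$ from (3.15) at the stationary point $\gamma_1 = (1 - \tanh \lambda)/2$ and on a grid, and compares with the closed form; the choice $\lambda = 1$ and the grid size are arbitrary illustrative values.

```python
import math

def f_alpha(alpha, gamma):
    # f_alpha(gamma) from (3.15), with the convention 0 log 0 = 0.
    ent = sum(x * math.log(x) for x in (gamma, 1.0 - gamma) if x > 0.0)
    return alpha * math.log(1.0 - 2.0 * gamma) - math.log(2.0) - ent

def closed_form(lam):
    # Right-hand side of (3.16).
    t = math.tanh(lam)
    return -(lam * t) * (1.0 - math.log(t)) + math.log(math.cosh(lam))

lam = 1.0
alpha = lam * math.tanh(lam)               # alpha = lambda tanh(lambda)
gamma_1 = 0.5 * (1.0 - math.tanh(lam))     # stationary point, from tanh(lambda) = 1 - 2 gamma_1
# Grid over [0, 1/2), stopping short of 1/2 where f_alpha diverges to -infinity.
grid_max = max(f_alpha(alpha, k / 20000) for k in range(10000))
```

The value at `gamma_1` should agree with `closed_form(lam)` up to rounding, and the grid maximum should approach it from below.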



3.6 Alternative proof of Proposition 2.6 via Poissonization 



We give an alternative proof of Proposition 2.6 based on a Poissonization device (as used by Kolchin in his proof of Theorem 2 in [23]) and large deviations arguments of a slightly different flavour from those in the proof above. The proof in this section is direct, avoiding the general Theorem 2.5, but does use instead some relatively deep local limit theory.

The following result can be found for example in [23].

Lemma 3.6. Suppose that $Z_1, Z_2, \ldots$ are independent Poisson random variables with mean $\mu > 0$. Let $Z_1^E, Z_2^E, \ldots$ be i.i.d., where the law of $Z_1^E$ is the same as the conditional law of $Z_1$, given that $Z_1$ is even: $\mathbb{P}[Z_1^E = k] = \mathbb{P}[Z_1 = k \mid Z_1 \in 2\mathbb{Z}]$. Then

\[ \pi_n(m) = \frac{\mathbb{P}[Z_1^E + \cdots + Z_n^E = m]}{\mathbb{P}[Z_1 + \cdots + Z_n = m]} \Bigl( \frac{1 + e^{-2\mu}}{2} \Bigr)^n. \tag{3.18} \]

Proof. By the well-known relationship between the Poisson and multinomial distributions (see e.g. [23, p. 140] or [24, p. 15]), and Bayes' Theorem,

\[ \pi_n(m) = \mathbb{P}[Z_1 \in 2\mathbb{Z}, \ldots, Z_n \in 2\mathbb{Z} \mid Z_1 + \cdots + Z_n = m] = \frac{\mathbb{P}[Z_1 + \cdots + Z_n = m \mid Z_1 \in 2\mathbb{Z}, \ldots, Z_n \in 2\mathbb{Z}]}{\mathbb{P}[Z_1 + \cdots + Z_n = m]} \cdot \mathbb{P}[Z_1 \in 2\mathbb{Z}]^n, \]

which with the expression for $\mathbb{P}[Z_1 \in 2\mathbb{Z}]$ in Lemma 6.1(ii) yields (3.18). □

Lemma 3.7. Let $X_1, X_2, \ldots$ be an i.i.d. sequence of $\mathbb{Z}$-valued random variables, $S_n := X_1 + \cdots + X_n$, and $(x_n)_{n \in \mathbb{N}}$ a sequence of even integers. Suppose that $\mathbb{E}[e^{t X_1}] < \infty$ for some $t > 0$, $\mathbb{P}[X_1 = 0] \wedge \mathbb{P}[X_1 = 2] > 0$, and $x_n = n \mathbb{E}[X_1] + o(n)$. Then

\[ \lim_{n \to \infty} n^{-1} \log \mathbb{P}[S_n = x_n] = 0. \]

Proof. Write $\mu = \mathbb{E}[X_1]$ and $\sigma^2 = \mathrm{Var}[X_1]$, which is finite and positive by the conditions in the lemma. Write $Y_j = X_j - \mu$ and $y_n = n^{-1/2} \sigma^{-1} (x_n - n\mu)$, so $\mathbb{E}[Y_j] = 0$, $\mathbb{E}[Y_j^2] = \sigma^2$, and $y_n = o(n^{1/2})$. If $Z_n = n^{-1/2} \sigma^{-1} \sum_{i=1}^{n} Y_i$, Richter's local central limit theorem [18, Chapter 7, §§1 and 4] tells us that

\[ \mathbb{P}[Z_n = y_n] = \Theta \Bigl( n^{-1/2} \exp \Bigl\{ -\frac{y_n^2}{2} \bigl( 1 + O(n^{-1/2} y_n) \bigr) \Bigr\} \Bigr), \]

which is $\exp\{o(n)\}$, since $y_n = o(n^{1/2})$. □

Second proof of Proposition 2.6. From Lemma 3.6, $n^{-1} \log \pi_n(m)$ can be expressed as

\[ n^{-1} \log \mathbb{P}[Z_1^E + \cdots + Z_n^E = m] - n^{-1} \log \mathbb{P}[Z_1 + \cdots + Z_n = m] + \log \cosh \mu - \mu. \tag{3.19} \]

By assumption, $m = m_n$ is such that $m_n/n \to \alpha > 0$. The proof proceeds by choosing $\mu$ so that $\mathbb{E}[Z_1^E] = \alpha$; then the first term in (3.19) vanishes in the limit by Lemma 3.7, and the proof of the theorem then reduces to evaluating the other logarithmic rate.

By Lemma 6.1(ii), the choice $\mathbb{E}[Z_1^E] = \alpha$ means $\mu \tanh \mu = \alpha$, so that $\mu = \lambda$. Since $Z_1 + \cdots + Z_n$ is Poisson with mean $n\lambda$,

\[ \lim_{n \to \infty} n^{-1} \log \mathbb{P}[Z_1 + \cdots + Z_n = m_n] = \lim_{n \to \infty} \frac{1}{n} \log \frac{e^{-n\lambda} (n\lambda)^{m_n}}{m_n!} = -\lambda + \lim_{n \to \infty} \frac{m_n}{n} \Bigl( \log n\lambda - \frac{1}{m_n} \log m_n! \Bigr). \]

Stirling's formula implies that $n^{-1} \log n! = \log(n/e) + o(1)$, so that

\[ \lim_{n \to \infty} n^{-1} \log \mathbb{P}[Z_1 + \cdots + Z_n = m_n] = -\lambda + \lim_{n \to \infty} \frac{m_n}{n} \Bigl( \log \frac{n \lambda e}{m_n} + o(1) \Bigr) = -\lambda + \alpha (\log \lambda - \log \alpha + 1). \tag{3.20} \]

Combining (3.20) with the $\mu = \lambda$ case of (3.19) we complete the proof. □
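The Poissonization identity (3.18) can be checked numerically for small cases. In the Python sketch below we take $\pi_n(m)$ to be the probability that all $n$ urns receive an even number of balls when $m$ balls are thrown uniformly, as in the proof of Lemma 3.6, and evaluate it via (3.1) with $\rho(s) = s$; the function names and the direct convolution are illustrative choices, not taken from the source.

```python
import math

def pi_exact(n, m):
    # pi_n(m) via (3.1) with rho(s) = s.
    return sum(math.comb(n, j) * (1 - 2 * j / n) ** m for j in range(n + 1)) / 2 ** n

def pi_poisson(n, m, mu):
    # Right-hand side of (3.18), for any mu > 0.
    p_even = math.exp(-mu) * math.cosh(mu)   # P[Z_1 even] = (1 + e^{-2 mu}) / 2
    single = [math.exp(-mu) * mu ** k / math.factorial(k) / p_even if k % 2 == 0 else 0.0
              for k in range(m + 1)]         # law of Z_1 conditioned to be even
    dist = [1.0] + [0.0] * m                 # law of an empty sum
    for _ in range(n):                       # n-fold convolution, truncated at m
        dist = [sum(dist[i] * single[k - i] for i in range(k + 1)) for k in range(m + 1)]
    poisson_total = math.exp(-n * mu) * (n * mu) ** m / math.factorial(m)
    return dist[m] / poisson_total * p_even ** n
```

For $n = 3$ and $m = 4$ a direct count gives $21/81 = 7/27$, and the right-hand side of (3.18) should return the same value regardless of the choice of $\mu$.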
4 Proofs of main results

4.1 Exact formula for the expected number of null vectors

Let $\mathcal{N}(n, m; \ell)$ denote the number of left null vectors of weight $\ell$, so that

\[ \mathcal{N}(n, m) = \sum_{\ell=0}^{m} \mathcal{N}(n, m; \ell). \tag{4.1} \]

The value of $\mathcal{N}(n, m)$ is the number of collections of rows of $M(n, m)$ which sum to $0$ (mod 2), and for each set of $\ell$ rows the probability that it sums to $0$ is $\mathbb{P}_{\rho_n}[A(n, \ell)]$. Hence

\[ \mathbb{E}_{\rho_n}[\mathcal{N}(n, m; \ell)] = \binom{m}{\ell} \mathbb{P}_{\rho_n}[A(n, \ell)], \tag{4.2} \]

and

\[ \mathbb{E}_{\rho_n}[\mathcal{N}(n, m)] = \sum_{\ell=0}^{m} \binom{m}{\ell} \mathbb{P}_{\rho_n}[A(n, \ell)]. \tag{4.3} \]

We can thus express $\mathbb{E}_{\rho_n}[\mathcal{N}(n, m)]$ using our exact formulae for $\mathbb{P}_{\rho_n}[A(n, \ell)]$ given in Lemma 3.1. The proofs of our main results, Theorems 2.7, 2.2 and 2.3, will be based on an asymptotic analysis of (4.3). As in the proof of Theorem 2.5 (see Section 2.3) it is most convenient to work in the binomial model, for which $W_n^{\mathrm{bin}}$ is the number of odd components in a multinomial $(W; n^{-1}, \ldots, n^{-1})$ vector. Thus a key step in the proof will be showing that, in the general case of $W_n \to W$ in distribution, the expression in (4.3) can be well approximated by the binomial case. First, in the next section, we make some preliminary computations.
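To make (4.3) concrete in the binomial model, the following Python sketch evaluates it exactly for $\rho(s) = s^r$ using (3.1). It also evaluates the single sum $2^{-n} \sum_j \binom{n}{j} (1 + \rho(1 - 2j/n))^m$, obtained by exchanging the two sums and applying the binomial theorem; that collapsed form is our own rearrangement, included here only as a cross-check.

```python
from fractions import Fraction
from math import comb

def prob_A_bin(n, ell, r):
    # (3.1) with rho(s) = s^r, evaluated exactly.
    return sum(comb(n, j) * Fraction(n - 2 * j, n) ** (r * ell) for j in range(n + 1)) / 2 ** n

def expected_null_vectors(n, m, r):
    # (4.3) in the binomial model; the ell = 0 term counts the zero vector.
    return sum(comb(m, ell) * prob_A_bin(n, ell, r) for ell in range(m + 1))

def expected_null_vectors_collapsed(n, m, r):
    # Same quantity after summing over ell with the binomial theorem.
    return sum(comb(n, j) * (1 + Fraction(n - 2 * j, n) ** r) ** m for j in range(n + 1)) / 2 ** n
```

The collapsed form makes visible why the exponential growth rate of the expectation is governed by a supremum over $\gamma = j/n$ of a function involving $\log(1 + \rho(1 - 2\gamma))$.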



4.2 Preliminaries 

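Before embarking on the main proof, we study the rate functions that will appear. Define

\[ F_{\rho,\alpha}(\gamma) := \alpha \log\bigl(1 + \rho(1 - 2\gamma)\bigr) - \log\bigl(2\gamma^{\gamma}(1-\gamma)^{1-\gamma}\bigr), \tag{4.4} \]

and recall from (2.1) and (2.2) that $F_\rho(\alpha) = \sup_{\gamma \in [0,1/2]} F_{\rho,\alpha}(\gamma)$ and $\alpha_\rho^* = \inf\{\alpha > 0 : F_\rho(\alpha) > 0\}$. Note that for $\gamma \in [0, 1/2]$, $\rho(1 - 2\gamma) \geq 0$. By continuity, $F_{\rho,\alpha}(\gamma)$ attains its supremum over $\gamma \in [0, 1/2]$; we denote by $\gamma_0 := \gamma_0(\alpha) \in [0, 1/2]$ the smallest point at which the supremum is attained.

We collect results on $F_\rho(\alpha)$ and $\alpha_\rho^*$ in the next lemma, which will enable us to complete the proof of Proposition 2.1.

Lemma 4.1. Suppose that $\mathbb{P}[W = 0] = 0$. For any $\alpha > 0$, $F_\rho(\alpha) \geq 0$, and $F_\rho$ is continuous and nondecreasing. The threshold $\alpha_\rho^*$ enjoys the following properties.

(i) $\alpha_\rho^* \in [0, 1]$, and $F_\rho(\alpha) = 0$ for $\alpha \leq \alpha_\rho^*$ but $F_\rho(\alpha) > 0$ for $\alpha > \alpha_\rho^*$.

(ii) If $\alpha < \alpha_\rho^*$, then for any $\varepsilon > 0$, $\sup_{\gamma \in [0, (1/2) - \varepsilon]} F_{\rho,\alpha}(\gamma) < 0$.

(iii) If $\alpha > \alpha_\rho^*$, then $\gamma_0(\alpha) \in [0, 1/2)$.

(iv) Suppose that $\tilde{W}$ is another $\mathbb{N}$-valued random variable, with $\tilde{\rho}(s) = \mathbb{E}[s^{\tilde{W}}]$, such that $\tilde{\rho}(s) \leq \rho(s)$ for all $s \in [0, 1]$. Then $\alpha_{\tilde{\rho}}^* \geq \alpha_\rho^*$.

(v) $\alpha_\rho^* = 0$ if and only if $\mathbb{P}[W = 1] > 0$.

(vi) If $\mathbb{P}[W = 2] = 1$, then $\alpha_\rho^* = 1/2$.

(vii) If $\mathbb{E}[W] < \infty$, then $\alpha_\rho^* < 1$.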



Proof. By Lemma gg) p(0) = F[W = 0] = and p(l) = 1; hence ^(1/2) = and 
Fp,a(ty — ( a — 1) l°g 2, so that F p (a) > (a — 1) + log 2 > 0. Since F Pia {pi) is nondecreasing 
as a function of a > 0, Dini's theorem implies continuity of F p (a) as a function of a > 0. 

For part (i), -F p (aO > (a — l) + log2 implies that F p (a) > for a > 1, so that 
a* < 1. On the other hand, for a < 1, To (a) G (0, 1/2], since by continuity there is some 
neighbourhood of for which F Pt01 (y) < 0. 

Since F Pta (j) is nondecreasing as a function of a, for a' > a, F p (a') > F p a i (79(a)) > 
F p (a), i.e., F p is also nondecreasing. Hence F p (a) > for a > a*. Also, the fact that 
F p (a) = for a < a* is immediate from the definition of a* and the fact that F p (a) > 0. 
Then F p (a*) = by the continuity of F p established above. Thus we obtain part (i). 

For part (ii), suppose that a < a*. Suppose for some 70 G [0, 1/2) that -F P)0! (7o) > 0. 
Since p(l — 27) > for 7 < 1/2, F p ^{^q) is strictly increasing in a, so there exists 
a' E (at, a*) for which F P)a /(7o) > 0, contradicting the definition of a*. This gives (ii). 

For part (iii), suppose that a > a*. Then F p (a) > by part (i) of the lemma; since 
F P)Q (l/2) = 0, the supremum is attained in [0, 1/2). 

For part (iv), we have that for any 7 G [0, 1/2], F p>a (7) > F P!a (j), since p(l — 27) > 
p(l - 27). So F p [a) < F p (a) for all a > 0, and hence > a*. 

For the remaining parts of the lemma we use more detailed properties of the generating 



function p(s) (see Lemma 6.3). For part (v), differentiating in (4.4) we obtain 

d _ . . 2ap'(l -2 7 ) A - 7 \ 



this is well defined at least for 7 G (0, 1). At 7 = 1/2 this equates to — 2aP[W / = 1], since 



by Lemma Q^O) = F[W = 1] and p(0) = 0. So if F[W = 1] > 0, ^(7) is equal to 
at 7 = 1/2 and its derivative there is negative for any a > 0, so that, for any a > 0, 
Fp, a {l) > for some 7 < 1/2. The 'if part of part (v) follows. 

Conversely, suppose that F[W = 1] = 0. Then the previous argument shows that 
F p<a (l/2) = F' p>a {l/2) = 0, while a calculation shows that F' p [ a (l/2) = 4ap"(0) -4. Hence 
by continuity there exists 5 > such that for a < 5 and (1/2) — 5 < 7 < 1/2 we have 
F' p \ a {l) < -3. Hence by Taylor's theorem, ^(7) < for a < 5 and (1/2) -5 < 7 < 1/2. 
Also, F Pt a(j) —> — log(27 7 (l — 7) 1-7 ) as a — > 0, which is strictly negative apart from 
at 7 = 1/2. Thus by Dini's theorem, for all a small enough we have F pa {^{) < for 
7 < (1/2) — 8. So all together we have shown that F PjCl ('y) < for all a sufficiently small. 
Hence a* > in this case, giving the 'only if part of (v). 



For part (vi), suppose that $\mathbb{P}[W = 2] = 1$, i.e., $\rho(s) = s^2$. In this case, (4.5) has a
zero at $\gamma \in [0, 1/2)$ if $\alpha = g(\gamma)$ where

$$g(\gamma) := \frac{1 + (1-2\gamma)^2}{4(1-2\gamma)} \log\left(\frac{1-\gamma}{\gamma}\right).$$

We claim that $g(\gamma)$ is decreasing on $[0, 1/2)$, with $g(\gamma) \to 1/2$ as $\gamma \uparrow 1/2$. To
verify this, we show $g'(\gamma) < 0$ for $\gamma \in [0, 1/2)$, which, after simplification, amounts to

$$\frac{(1 + (1-2\gamma)^2)(1-2\gamma)}{8\gamma^2(1-\gamma)^2} > \log\left(\frac{1-\gamma}{\gamma}\right).$$

Setting $z = 1 - 2\gamma$, it suffices to show that

$$\frac{z}{(1-z^2)^2} > \frac{1}{2}\log\left(\frac{1+z}{1-z}\right) \quad \text{for } z \in (0,1),$$

which can be verified by term-by-term comparison of the corresponding power series, namely
$z + 2z^3 + 3z^5 + \cdots > z + \frac{z^3}{3} + \frac{z^5}{5} + \cdots$. Hence $g(\gamma) = \alpha$ has no solution for $\alpha < 1/2$,
in which case the only stationary value of $F_{\rho,\alpha}$ is at $\gamma = 1/2$, necessarily the maximum.
Hence $\alpha^* \ge 1/2$. On the other hand, if $\alpha > 1/2$ then $F''_{\rho,\alpha}(1/2) = 8\alpha - 4 > 0$, while
$F_{\rho,\alpha}(1/2) = F'_{\rho,\alpha}(1/2) = 0$, so by Taylor's theorem and continuity there exists $\gamma < 1/2$
with $F_{\rho,\alpha}(\gamma) > 0$. Hence $\alpha^* = 1/2$, proving (vi).

Finally we prove part (vii). If $\mathbb{E}[W] < \infty$, Lemma 6.3(ii) implies that, as $\gamma \downarrow 0$,
$\rho'(1 - 2\gamma) = \mathbb{E}[W] + o(1)$. Thus the final term on the right-hand side of (4.5) dominates
in the $\gamma \downarrow 0$ limit, and there exists $\delta > 0$ such that $\frac{d}{d\gamma}F_{\rho,\alpha}(\gamma) > \delta$ for all $\gamma \in [0, \delta]$ and all
$\alpha \in [0, 1]$. Then by an application of the mean value theorem, $F_{\rho,\alpha}(\delta) > (\alpha - 1)\log 2 + \delta^2$
for all $\alpha \in [0, 1]$. Thus taking $\alpha < 1$ close enough to 1 we see that $F_{\rho,\alpha}(\delta) > 0$, which
implies that $\alpha^* < 1$. □
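The threshold described in this lemma can be checked numerically. The following is a minimal sketch, assuming the form $F_{\rho,\alpha}(\gamma) = \alpha\log(1+\rho(1-2\gamma)) - \log(2\gamma^\gamma(1-\gamma)^{1-\gamma})$ that can be read off from (4.16); the function names, the dictionary representation of the weight distribution, and the grid search are illustrative choices, not part of the paper. Since $F_{\rho,\alpha}(\gamma) > 0$ exactly when $\alpha$ exceeds the ratio of the entropy term to $\log(1+\rho(1-2\gamma))$, the infimum of that ratio over $\gamma \in [0,1/2)$ estimates $\alpha^*$.

```python
import math

def rho(s, pmf):
    """Generating function rho(s) = sum_k P[W = k] s^k for a weight pmf {k: prob}."""
    return sum(p * s ** k for k, p in pmf.items())

def entropy_gap(gamma):
    """log(2 gamma^gamma (1-gamma)^(1-gamma)), with the convention 0 log 0 = 0."""
    def xlogx(x):
        return x * math.log(x) if x > 0 else 0.0
    return math.log(2.0) + xlogx(gamma) + xlogx(1.0 - gamma)

def F(gamma, alpha, pmf):
    """Assumed form of F_{rho,alpha}(gamma), as read off from (4.16)."""
    return alpha * math.log1p(rho(1.0 - 2.0 * gamma, pmf)) - entropy_gap(gamma)

def alpha_star(pmf, grid=5000):
    """Grid estimate of inf{alpha : F(gamma, alpha) > 0 for some gamma in [0, 1/2)}."""
    best = float("inf")
    for i in range(grid):
        gamma = 0.5 * i / grid  # gamma ranges over [0, 1/2), so rho(1 - 2 gamma) > 0
        ratio = entropy_gap(gamma) / math.log1p(rho(1.0 - 2.0 * gamma, pmf))
        best = min(best, ratio)
    return best

if __name__ == "__main__":
    print(alpha_star({3: 1.0}))  # W = 3 almost surely
    print(alpha_star({2: 1.0}))  # W = 2 almost surely, compare with part (vi)
```

For $W = 3$ almost surely the estimate can be compared with the value $0.88949\ldots$ quoted in the abstract, and for $W = 2$ almost surely with the value $1/2$ from part (vi).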



Proof of Proposition 2.1. Extract the relevant parts of Lemma 4.1. □



4.3 Approximation by the binomial model 



In Section 3.4 we showed (in Lemma 3.5) that $\mathbb{P}_{\rho_n}[A(n, m_n)]$ can be well approximated by
$\mathbb{P}^{\mathrm{bin}}_{\rho}[A(n, m_n)]$ on the logarithmic scale, provided that $m_n/n \to \alpha$. The following result
is an analogous approximation lemma for $\mathbb{E}_{\rho_n}[\mathcal{N}(n, m_n)]$. One could obtain such a result
from Lemma 3.5 applied to (4.3), with some work (including dealing separately with
terms with $\ell = o(n)$: cf. Section 4.4 below). However, it is more convenient to proceed
directly, albeit using similar ideas to the proof of Lemma 3.5; in this case we are helped
by the fact that $\mathbb{E}_{\rho_n}[\mathcal{N}(n, m)]$ possesses monotonicity properties absent for $\mathbb{P}_{\rho_n}[A(n, m)]$.

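Lemma 4.2. Suppose that $W_n \xrightarrow{d} W$ and $m_n/n \to \alpha > 0$. Then

$$\lim_{n\to\infty} n^{-1}\left| \log \mathbb{E}_{\rho_n}[\mathcal{N}(n, m_n)] - \log \mathbb{E}^{\mathrm{bin}}_{\rho}[\mathcal{N}(n, m_n)] \right| = 0.$$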



Proof. We use a coupling argument, constructing the general model with row weights
distributed as $W_n \to W$ on the same probability space as the binomial model with row
weights distributed as $W^{\mathrm{bin}}_n \to W$ (by Lemma 3.3). For any $n$, we can use a probability
space in which, for each row, the weight in each model converges almost surely to a copy
of $W$. Indeed, let $W(1), W(2), \ldots$ be independent copies of $W$. Using the Skorokhod
representation theorem, we may take $W_{n,1}, W_{n,2}, \ldots$ as independent copies of $W_n$, being
the weights of the rows in the general model, such that $W_{n,i} \to W(i)$ almost surely. Also,
take $W^{\mathrm{bin}}_{n,i}$ to be the number of odd components in a multinomial $(W(i); n^{-1}, \ldots, n^{-1})$
distribution, so that $W^{\mathrm{bin}}_{n,1}, W^{\mathrm{bin}}_{n,2}, \ldots$ are independent copies of $W^{\mathrm{bin}}_n$ and the weights of
the rows in the binomial model.

Let $A_n(i) := \{W_{n,i} \ne W^{\mathrm{bin}}_{n,i}\}$. Then for any $\delta > 0$, we may take $n$ large enough so
that $\mathbb{P}[A_n(i)] < \delta$, uniformly in $i$. Let $K(n,m) = \sum_{i=1}^{m} \mathbf{1}_{A_n(i)}$ denote the number of 'bad'
rows. Then $K(n, m)$ is stochastically dominated by a $\mathrm{Bin}(m, \delta)$ variable. In particular,
for any fixed $\varepsilon > 0$ and any $C < \infty$, standard binomial tail bounds imply that we may
take $\delta$ small enough, and hence $n$ sufficiently large, so that

$$\mathbb{P}[K(n, m_n) > \varepsilon n] \le \mathbb{P}[\mathrm{Bin}(2\alpha n, \delta) > \varepsilon n] \le \exp\{-Cn\}. \qquad (4.6)$$

We claim that each row added to a matrix can increase the number of null vectors by at
most a factor of 2; this follows from (1.1) and (1.2). Hence

$$|\log \mathcal{N}(n, m_n) - \log \mathcal{N}'(n, m_n)| \le K(n, m_n), \qquad (4.7)$$

where $\mathcal{N}(n, m_n)$ is the number of null vectors in the matrix with the $W_{n,i}$ and $\mathcal{N}'(n, m_n)$
is the number of null vectors in the matrix with the $W^{\mathrm{bin}}_{n,i}$. In particular, on $\{K(n, m_n) \le \varepsilon n\}$,
the bound in (4.7) is $\varepsilon n$. The statement in the lemma follows from (4.6) and (4.7)
since $\varepsilon > 0$ and $C < \infty$ were arbitrary. □
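The factor-of-2 claim used in this proof can be illustrated computationally. The sketch below is only an illustration under stated assumptions: rows are encoded as integer bitmasks (a representation chosen here, not taken from the paper), and the number of left null vectors is computed as $2^{m - \mathrm{rank}}$ using the standard rank-nullity identity over GF[2]. Appending one row raises $m$ by one and the rank by zero or one, so the count is multiplied by 2 or by 1.

```python
import random

def gf2_rank(rows):
    """Rank over GF(2) of a matrix whose rows are given as integer bitmasks."""
    basis = {}  # leading-bit position -> reduced row having that leading bit
    for row in rows:
        while row:
            lead = row.bit_length() - 1
            if lead not in basis:
                basis[lead] = row
                break
            row ^= basis[lead]  # eliminate the leading bit and keep reducing
    return len(basis)

def count_left_null_vectors(rows):
    """Number of a in {0,1}^m with aM = 0 mod 2 (zero vector included)."""
    return 2 ** (len(rows) - gf2_rank(rows))

def random_row(n, weight, rng):
    """Bitmask of a row of the given weight, uniform over column subsets."""
    return sum(1 << c for c in rng.sample(range(n), weight))

if __name__ == "__main__":
    rng = random.Random(0)
    rows, previous = [], 1
    for _ in range(40):
        rows.append(random_row(20, 3, rng))
        current = count_left_null_vectors(rows)
        assert current in (previous, 2 * previous)  # factor 1 or 2 per added row
        previous = current
    print(previous)
```

The same reasoning shows that replacing $K$ rows changes $\log_2$ of the count by at most $K$, which is the content of (4.7).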
4.4 Null vectors consisting of few rows 

In the asymptotics of $\mathbb{E}_{\rho_n}[\mathcal{N}(n, m)]$, it turns out that null vectors of low weight play a
distinct and important role. Recall (4.1). The main result of this section is the following
lemma, which exhibits a polynomial growth rate for null vectors of few rows.

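Lemma 4.3. Suppose that there exist $r_0 \ge 3$ and $r_1 < \infty$ such that $\mathbb{P}[r_0 \le W_n \le r_1] = 1$
for all $n$. Suppose that $m_n/n \to \alpha > 0$. Then there exists $\delta > 0$ such that

$$\sum_{2 \le \ell \le \delta n} \mathbb{E}_{\rho_n}[\mathcal{N}(n, m_n; \ell)] = O(n^{2-r_0}). \qquad (4.8)$$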



Remark. The exponent $2 - r_0$ in (4.8) cannot be improved when $\mathbb{P}[W_n = r_0] > 0$,
because $\mathbb{E}[\mathcal{N}(n, m_n; 2)]$ is itself of order $n^{2-r_0}$. Indeed, there are of order $n^2$ weight-2
candidate vectors, and each is null if the two corresponding rows both have $r_0$ non-zeros
in matching positions, an event of probability of order $n^{-r_0}$.
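The order of magnitude in this remark can be made concrete in the simplest case. The following sketch assumes, for illustration only, that every row has weight exactly $r_0$ and is uniform over column subsets of that size; the function name is hypothetical. Two such rows sum to zero mod 2 exactly when they coincide, which has probability $1/\binom{n}{r_0}$, so the expected number of weight-2 null vectors is $\binom{m}{2}/\binom{n}{r_0}$.

```python
from math import comb, factorial

def expected_weight2_null_vectors(n, m, r0):
    """E[N(n, m; 2)] when every row has weight exactly r0, uniform given its weight.

    Assumption for illustration: two distinct rows form a null vector exactly
    when they are equal, an event of probability 1 / C(n, r0).
    """
    return comb(m, 2) / comb(n, r0)

if __name__ == "__main__":
    alpha, r0 = 0.9, 3
    for n in (10**3, 10**4, 10**5):
        m = int(alpha * n)
        scaled = expected_weight2_null_vectors(n, m, r0) * n ** (r0 - 2)
        # scaled should approach alpha^2 * r0! / 2 as n grows
        print(n, scaled, alpha**2 * factorial(r0) / 2)
```

Multiplying by $n^{r_0-2}$ gives a quantity that stabilises near $\alpha^2 r_0!/2$, consistent with $\mathbb{E}[\mathcal{N}(n, m_n; 2)]$ being of order $n^{2-r_0}$.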

Proof of Lemma 4.3. Let $n, \ell \in \mathbb{N}$. Let $R = R(n, \ell)$ denote the 'column range' of the
matrix $M(n, \ell)$, that is, the number of columns of degree at least 1. Let us take $k = k(\ell) \in \mathbb{N}$,
to be chosen later. We shall estimate $\mathbb{P}_{\rho_n}[A(n, \ell)]$ by considering separately the
events $R \le k$ and $R > k$.

We interpret $M(n, \ell)$ as arising from a random allocation scheme, where for each row
we throw balls at random into $n$ urns (columns). If $R \le k$ then there is some set of $k$
columns, such that all the balls land in these $k$ columns. For each ball, the probability
that it lands in one of the first $k$ columns, given that the other balls cast so far for that
row all land in the first $k$ columns, is at most $k/n$. Hence since for each row at least $r_0$
balls are cast, and we consider $\ell$ rows here, the union bound over the $\binom{n}{k}$ possible sets of columns gives

$$\mathbb{P}_{\rho_n}[R \le k] \le \binom{n}{k}\left(\frac{k}{n}\right)^{\ell r_0} \le \frac{n^{k - \ell r_0} k^{\ell r_0}}{k!}. \qquad (4.9)$$

If $R > k$ then to have $A(n, \ell)$ occur we need to have each of the columns in the range get
hit at least twice (i.e., have degree at least 2). Thus if $R > k$ and $A(n, \ell)$ occurs there
is a collection of $k + 1$ columns such that each column in the collection gets hit at least
twice. Let $B(i)$ be the event that the column $i$ gets hit at least twice. The probability
that a particular entry is 1, given the values of up to $k$ other entries in the same row, is
at most $r_1/(n - k)$. Hence the union bound yields for $1 \le j \le k + 1$ that

$$\mathbb{P}_{\rho_n}\left[B(j) \,\Big|\, \bigcap_{i<j} B(i)\right] \le \binom{\ell}{2}\left(\frac{r_1}{n-k}\right)^2,$$

and hence we have

$$\mathbb{P}_{\rho_n}\left[\bigcap_{i=1}^{k+1} B(i)\right] \le \left(\frac{\ell^2}{2}\left(\frac{r_1}{n-k}\right)^2\right)^{k+1},$$

so that by the union bound, provided $k < n/2$ we have

$$\mathbb{P}_{\rho_n}[\{R > k\} \cap A(n, \ell)] \le \binom{n}{k+1}\left(\frac{\ell^2}{2}\right)^{k+1}\left(\frac{2r_1}{n}\right)^{2(k+1)} \le \frac{n^{-(k+1)}\ell^{2(k+1)}c_1^{k+1}}{(k+1)!},$$

where we put $c_1 = 2r_1^2$. Combined with (4.9) this gives

$$\mathbb{P}_{\rho_n}[A(n, \ell)] \le \frac{n^{k-\ell r_0}k^{\ell r_0}}{k!} + \frac{n^{-(k+1)}\ell^{2(k+1)}c_1^{k+1}}{(k+1)!}.$$

We assume $m_n/n \to \alpha \in (0, \infty)$, so for $n$ large enough so that $m_n \le (1 + \alpha)n$ we have
for all $\ell$, and for $k < n/2$, that

$$\mathbb{E}_{\rho_n}[\mathcal{N}(n, m_n; \ell)] \le \frac{((\alpha+1)n)^{\ell}}{\ell!}\left(\frac{n^{k-\ell r_0}k^{\ell r_0}}{k!} + \frac{n^{-(k+1)}\ell^{2(k+1)}c_1^{k+1}}{(k+1)!}\right). \qquad (4.10)$$

Taking $k = \ell + r_0 - 2$, we obtain for each fixed $\ell$ that for some constant $c(\ell)$ we have

$$\mathbb{E}_{\rho_n}[\mathcal{N}(n, m_n; \ell)] \le c(\ell)\left(n^{\ell(1-r_0)+\ell+r_0-2} + n^{\ell-k-1}\right) = c(\ell)\left(n^{(\ell-1)(2-r_0)} + n^{1-r_0}\right), \qquad (4.11)$$

which is $O(n^{2-r_0})$ for any fixed $\ell \ge 2$.

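Fix an integer $K \ge 2$, to be chosen later, and consider $K \le \ell \le \delta n$. Now put
$k = \ell(r_0 - 1) - \lceil \ell/2 \rceil$. Assume $\delta < 1/(2r_0)$; then for $\ell \le \delta n$ this choice of $k$ satisfies
$k \le n/2$, so that (4.10) remains valid. Also note that, since $r_0 \ge 3$, $k \ge (3\ell - 1)/2 \ge \ell$
provided $\ell \ge 2$.

By the bound $\ell! \ge \ell^{\ell}e^{-\ell}$ and similar for $k!$, there are constants $c_2, c_3, c_4$ such that the first
term in the right side of (4.10) (i.e. the product of the first factor with the first term in
the second factor) is bounded by

$$\frac{c_2^{\ell}\, n^{\ell(1-r_0)+k}\, k^{\ell r_0}}{\ell^{\ell} k^{k}} \le \frac{c_3^{\ell}\, n^{-\lceil \ell/2 \rceil}\, \ell^{\ell r_0}}{\ell^{\ell} k^{k}} \le c_4^{\ell}\, \ell^{\ell(r_0-1)-k}\, n^{-\lceil \ell/2 \rceil} = c_4^{\ell}\left(\frac{\ell}{n}\right)^{\lceil \ell/2 \rceil}, \qquad (4.12)$$

where for the middle inequality we used $k \le r_0\ell$, and for the last inequality we used the fact that $\ell \le k$ to replace $k^k$ by $\ell^k$ in the denominator.

Similarly, there are constants $c_5, c_6, c_7$ such that the second term in the right side of
(4.10) is bounded by

$$\frac{c_5^{\ell}\, n^{\ell-k-1}\, \ell^{2\ell(r_0-1)-2\lceil \ell/2 \rceil+2}}{\ell^{\ell}(k+1)^{k+1}} \le \frac{c_6^{\ell}\, n^{\ell(2-r_0)+\lceil \ell/2 \rceil-1}\, \ell^{\ell(2r_0-2)-2\lceil \ell/2 \rceil+2}}{\ell^{\ell r_0-\lceil \ell/2 \rceil+1}} \le c_7^{\ell}\left(\frac{\ell}{n}\right)^{\ell(r_0-2)-\lceil \ell/2 \rceil+1}.$$

Combining with (4.12), since $r_0 \ge 3$ so $r_0 - 2 \ge 1$ and $\lceil \ell/2 \rceil \le (\ell/2) + 1$, we can find a
constant $c_8$ such that for $2 \le \ell \le \delta n$ we have

$$\mathbb{E}_{\rho_n}[\mathcal{N}(n, m_n; \ell)] \le \left(\frac{c_8\ell}{n}\right)^{\ell/2}.$$

By calculus we have that $(c_8x/n)^{x/2}$ is decreasing in $x < n/(c_8e)$, so provided $\delta < 1/(c_8e)$
the last bound is maximized, over $K \le \ell \le \delta n$, at $\ell = K$, so that

$$\sum_{K \le \ell \le \delta n} \mathbb{E}_{\rho_n}[\mathcal{N}(n, m_n; \ell)] \le \delta n\left(\frac{c_8K}{n}\right)^{K/2},$$

which is $O(n^{2-r_0})$ provided we choose $K$ so that $K/2 \ge r_0 - 1$. □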



4.5 Proof of Theorem 2.2 



First we prove Theorem 2.2 for the binomial model, i.e., for $\mathbb{P}^{\mathrm{bin}}_{\rho}$ on the right-hand side
of (4.3), and then use Lemma 4.2. Specifically, we prove the following result.

Lemma 4.4. Suppose that $\mathbb{P}[W \ge 1] = 1$. Suppose that $m_n/n \to \alpha \in (0, \infty)$ as $n \to \infty$.
Then with $F_\rho(\alpha)$ as defined by (2.2),

$$\lim_{n\to\infty} n^{-1}\log \mathbb{E}^{\mathrm{bin}}_{\rho}[\mathcal{N}(n, m_n)] = F_\rho(\alpha). \qquad (4.13)$$



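Proof. From (3.1),

$$\sum_{\ell=0}^{m}\binom{m}{\ell}\mathbb{P}^{\mathrm{bin}}_{\rho}[A(n, \ell)] \le (m+1)(n+1)2^{-n}\sup_{0\le\ell\le m}\,\sup_{0\le j\le n}\binom{m}{\ell}\binom{n}{j}|\rho(1-(2j/n))|^{\ell}$$

$$\le (m+1)(n+1)2^{-n}\sup_{\beta\in[0,1]}\,\sup_{\gamma\in[0,1]}\binom{m}{\beta m}\binom{n}{\gamma n}|\rho(1-2\gamma)|^{\beta m}, \qquad (4.14)$$

setting $\binom{n}{x} = 0$ for $x \notin \{0, 1, \ldots, n\}$. Write

$$S_a(\beta, \gamma) := \left(\frac{|\rho(1-2\gamma)|^{\beta}}{\beta^{\beta}(1-\beta)^{1-\beta}}\right)^{a}\frac{1}{2\gamma^{\gamma}(1-\gamma)^{1-\gamma}}.$$

Taking $m = m_n = O(n)$ in (4.14) and using the first inequality in (6.3), we obtain

$$n^{-1}\log\sum_{\ell=0}^{m_n}\binom{m_n}{\ell}\mathbb{P}^{\mathrm{bin}}_{\rho}[A(n, \ell)] \le O(n^{-1}\log n) + \log\sup_{\beta\in[0,1]}\,\sup_{\gamma\in[0,1]}S_{m_n/n}(\beta, \gamma). \qquad (4.15)$$

For any $B \ge 0$, routine calculus (with a separate argument for $B = 0$) shows that

$$\sup_{\beta\in[0,1]}\frac{B^{\beta}}{\beta^{\beta}(1-\beta)^{1-\beta}} = B + 1,$$

with the supremum attained at $\beta = B/(1 + B)$, so that from (4.15) we have

$$n^{-1}\log\mathbb{E}^{\mathrm{bin}}_{\rho}[\mathcal{N}(n, m_n)] \le O(n^{-1}\log n) + \sup_{\gamma\in[0,1]}\log\left(\frac{(1+|\rho(1-2\gamma)|)^{m_n/n}}{2\gamma^{\gamma}(1-\gamma)^{1-\gamma}}\right). \qquad (4.16)$$

Considering the transformation $\gamma \mapsto 1 - \gamma$, we see that, for any $a > 0$,

$$\sup_{\gamma\in[1/2,1]}\left(\frac{(1+|\rho(1-2\gamma)|)^{a}}{2\gamma^{\gamma}(1-\gamma)^{1-\gamma}}\right) = \sup_{\gamma\in[0,1/2]}\left(\frac{(1+|\rho(2\gamma-1)|)^{a}}{2\gamma^{\gamma}(1-\gamma)^{1-\gamma}}\right) \le \sup_{\gamma\in[0,1/2]}\left(\frac{(1+\rho(1-2\gamma))^{a}}{2\gamma^{\gamma}(1-\gamma)^{1-\gamma}}\right),$$

since, for $\gamma \in [0, 1/2]$, $|\rho(2\gamma - 1)| \le \rho(1 - 2\gamma)$, by Lemma 6.3(iii). Hence from (4.16) we
have, with $F_\rho(\alpha)$ as defined at (2.2),

$$n^{-1}\log\mathbb{E}^{\mathrm{bin}}_{\rho}[\mathcal{N}(n, m_n)] \le O(n^{-1}\log n) + F_\rho(m_n/n).$$

Since $m_n/n \to \alpha$ and $\alpha \mapsto F_\rho(\alpha)$ is continuous (see Lemma 4.1),

$$\limsup_{n\to\infty} n^{-1}\log\mathbb{E}^{\mathrm{bin}}_{\rho}[\mathcal{N}(n, m_n)] \le F_\rho(\alpha).$$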

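For the lower bound, we use the fact that

$$\mathbb{E}^{\mathrm{bin}}_{\rho}[\mathcal{N}(n, m_n)] \ge \sum_{j=0}^{\lfloor n/2 \rfloor}\sum_{\ell=0}^{m_n} 2^{-n}\binom{m_n}{\ell}\binom{n}{j}\left(\rho(1-(2j/n))\right)^{\ell} \ge 2^{-n}\sup_{\beta\in[0,1]}\,\sup_{\gamma\in[0,1/2]}\binom{m_n}{\beta m_n}\binom{n}{\gamma n}|\rho(1-2\gamma)|^{\beta m_n},$$

using the nonnegativity of the appropriate terms for both inequalities. Using the lower
bound in (6.4), similarly to above, we obtain that $\liminf_{n\to\infty} n^{-1}\log\mathbb{E}^{\mathrm{bin}}_{\rho}[\mathcal{N}(n, m_n)] \ge F_\rho(\alpha)$.
Hence combining the upper and lower bounds, we obtain (4.13). □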

Now we can give the proof of our main result. 



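Proof of Theorem 2.2. Lemma 4.4 shows that (2.3) holds for the binomial model,
and Lemma 4.2 shows that the result carries over to the general case. For the final
statement of the theorem, suppose that $\alpha < \alpha^*$ and that $\mathbb{P}[r_0 \le W_n \le r_1] = 1$ for some
$r_0 \ge 3$ and $r_1 < \infty$. Lemma 4.3 shows that, for a suitable $\delta > 0$,

$$\sum_{\ell=2}^{\delta n}\binom{m_n}{\ell}\mathbb{P}_{\rho_n}[A(n, \ell)] = O(n^{2-r_0}). \qquad (4.17)$$

For $\ell > \delta n$, we first restrict to the binomial model. Choose $\varepsilon > 0$ so that $(3\varepsilon)^{\delta} < 2^{-2\alpha}$.
By a similar argument to (4.14), but splitting the supremum over $j$ into two parts,

$$\sum_{\ell=\delta n}^{m_n}\binom{m_n}{\ell}\mathbb{P}^{\mathrm{bin}}_{\rho}[A(n, \ell)] \le (m_n+1)(n+1)2^{-n}\sup_{\delta n\le\ell\le m_n}\,\sup_{j:|j-(n/2)|\le\varepsilon n}\binom{m_n}{\ell}\binom{n}{j}|\rho(1-(2j/n))|^{\ell}$$

$$+\,(m_n+1)(n+1)2^{-n}\sup_{0\le\ell\le m_n}\,\sup_{j:|j-(n/2)|>\varepsilon n}\binom{m_n}{\ell}\binom{n}{j}|\rho(1-(2j/n))|^{\ell}. \qquad (4.18)$$

Similarly to (4.16), the second term on the right-hand side of (4.18) is bounded above by

$$\exp\left\{n\left(o(1) + \sup_{\gamma\in[0,(1/2)-\varepsilon]}F_{\rho,m_n/n}(\gamma)\right)\right\},$$

which decays to 0 exponentially fast, by Lemma 4.1(ii), since $m_n/n \to \alpha \in (0, \alpha^*)$. On
the other hand, for $|j - (n/2)| \le \varepsilon n$, we have from Lemma 6.3(i) that $|\rho(1 - (2j/n))| \le 3\varepsilon$,
for $\varepsilon > 0$ small enough, so that, since $\ell \ge \delta n$, $|\rho(1 - (2j/n))|^{\ell} \le (3\varepsilon)^{\delta n}$, so by the choice
of $\varepsilon$ the first term in the right-hand side of (4.18) tends to 0 exponentially fast. Hence

$$\limsup_{n\to\infty} n^{-1}\log\sum_{\ell=\delta n}^{m_n}\binom{m_n}{\ell}\mathbb{P}^{\mathrm{bin}}_{\rho}[A(n, \ell)] < 0. \qquad (4.19)$$

We next deduce a version of (4.19) with $\mathbb{P}_{\rho_n}$ in place of the special case $\mathbb{P}^{\mathrm{bin}}_{\rho}$, using
Lemma 3.5 once more. To this end, observe first that the $\mathbb{P}_{\rho_n}$-analogue of the sum in
(4.19) consists of $O(n)$ nonnegative terms, so is bounded between the largest term and
$O(n)$ times that same term, so that

$$n^{-1}\log\sum_{\ell=\delta n}^{m_n}\binom{m_n}{\ell}\mathbb{P}_{\rho_n}[A(n, \ell)] = n^{-1}\log\max_{\delta n\le\ell\le m_n}\binom{m_n}{\ell}\mathbb{P}_{\rho_n}[A(n, \ell)] + O(n^{-1}\log n).$$

By Lemma 3.5, for any $\varepsilon > 0$, there exist $\varepsilon_{n,\ell}$ with $|\varepsilon_{n,\ell}| \le \varepsilon$, uniformly for $\ell$ with
$\delta n \le \ell \le m_n$ and $n$ sufficiently large, such that

$$\mathbb{P}_{\rho_n}[A(n, \ell)] = \mathbb{P}^{\mathrm{bin}}_{\rho}[A(n, \ell)]\exp\{\varepsilon_{n,\ell}\,n\}.$$

So we obtain

$$\limsup_{n\to\infty} n^{-1}\log\sum_{\ell=\delta n}^{m_n}\binom{m_n}{\ell}\mathbb{P}_{\rho_n}[A(n, \ell)] < 0,$$

which, combined with (4.17), yields (2.4). □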



5 Cores of sparse random hypergraphs 
5.1 Hypergraphs and 2-cores 

Given a set $V = \{v_1, \ldots, v_n\}$, whose elements we call vertices, a non-empty subset of
$V$ is called a hyperedge. Given a collection $\mathcal{E} := (E_i)$ of $m$ hyperedges, we refer to the
pair $(V, \mathcal{E})$ as a hypergraph. This hypergraph may be identified with an $m \times n$ matrix $A$
with entries in $\{0, 1\}$ (the incidence matrix of the hypergraph), having no zero rows, as
follows. The entry $a_{i,j}$ of $A$ takes the value 1 if and only if $v_j \in E_i$, in which case we say
row $i$ is incident to column $j$, and that hyperedge $E_i$ is incident to vertex $v_j$, and refer
to $(E_i, v_j)$ as an incidence of the hypergraph.

The number of hyperedges incident to a vertex $v$ is the degree of $v$. Fix a hypergraph
$(V, \mathcal{E})$. For $\mathcal{F} \subseteq \mathcal{E}$, the set $V(\mathcal{F}) \subseteq V$ of vertices which are incident to at least one of the
hyperedges in $\mathcal{F}$ is called the vertex span of $\mathcal{F}$. We identify the hypergraph $(\mathcal{F}, V(\mathcal{F}))$
by the edge subset $\mathcal{F}$ that induces it, and call $\mathcal{F} \subseteq \mathcal{E}$ a partial hypergraph. A partial
hypergraph $\mathcal{F} \ne \emptyset$ is a hypercycle if every vertex $v$ has even degree with respect to
$\mathcal{F}$. For an incidence matrix $A$, a non-zero left null vector is the indicator of a hypercycle in the
corresponding hypergraph.

Given a hypergraph (V, £), the 2-core is defined via the following algorithm: 

1. If there exists no vertex of degree one, stop. 

2. Otherwise, select an arbitrary vertex of degree one, and delete the unique incident 
hyperedge; then return to Step 1. 
The algorithm terminates, because the partial hypergraphs are decreasing; the terminal
partial hypergraph, which does not depend on the arbitrary choices made in Step 2
(see [11, pp. 127-128]), is called the 2-core of $\mathcal{E}$, denoted $\mathrm{Core}(\mathcal{E})$. Possibly $\mathrm{Core}(\mathcal{E})$ has
no hyperedges. Figure 2 illustrates a hypergraph with 19 vertices and 12 hyperedges, of
which 4 hyperedges are in the 2-core.
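The peeling algorithm above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: hyperedges are represented as Python sets of vertex labels (an assumption made here), and the function name `two_core` is hypothetical. A stack holds vertices that currently have degree one; the unique hyperedge at such a vertex is deleted and the degrees of its other vertices are updated.

```python
from collections import defaultdict

def two_core(hyperedges):
    """Return the set of indices of hyperedges that survive in the 2-core.

    hyperedges: list of non-empty sets of vertices.
    """
    alive = set(range(len(hyperedges)))
    incident = defaultdict(set)  # vertex -> indices of surviving incident hyperedges
    for e, vertices in enumerate(hyperedges):
        for v in vertices:
            incident[v].add(e)
    stack = [v for v, edges in incident.items() if len(edges) == 1]
    while stack:
        v = stack.pop()
        if len(incident[v]) != 1:
            continue  # degree changed since v was pushed
        (e,) = incident[v]
        alive.discard(e)  # delete the unique hyperedge incident to v
        for u in hyperedges[e]:
            incident[u].discard(e)
            if len(incident[u]) == 1:
                stack.append(u)
    return alive

if __name__ == "__main__":
    edges = [{1, 2, 3}, {1, 2, 4}, {3, 4, 5}, {5, 1, 2}, {5, 6, 7}]
    print(sorted(two_core(edges)))
```

In the small example, the last hyperedge contains vertices of degree one and is peeled off, while every vertex of the remaining four hyperedges keeps degree at least two.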






Figure 2: Pictorial representation of the adjacency matrix for a hypergraph with 19 
vertices and 12 hyperedges, whose incidences are shown as •. The first 8 hyperedges are 
not in the 2-core. The last 4 hyperedges form a linear system of rank 3 over GF[2]. The 
entire 19 x 12 system has rank 11, and so contains a hypercycle (hyperedges 9 and 11). 

The connection between the 2-core and hypercycles was exploited by Cooper [8, p.
371], following an idea that he attributes to Molloy (see [6, p. 268]). The connection is
demonstrated by the following useful observation.

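Lemma 5.1. Suppose that the 2-core $\mathcal{C} := \mathrm{Core}(\mathcal{E})$ of a hypergraph $(V, \mathcal{E})$ has vertex
span $V(\mathcal{C}) \subseteq V$ and size (number of hyperedges) $|\mathcal{C}|$.

(i) No hyperedge $E \notin \mathcal{C}$ can belong to a hypercycle of $(V, \mathcal{E})$.

(ii) If $\mathcal{C} = \emptyset$, then $(V, \mathcal{E})$ contains no hypercycle.

(iii) If $|V(\mathcal{C})| < |\mathcal{C}|$, then $(V, \mathcal{E})$ contains a hypercycle.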

Proof. If there are $s$ hyperedges not in the 2-core $\mathcal{C}$, there exists a labelling of them
as $E_1, E_2, \ldots, E_s$ with the property that, for every $j$, $E_j$ has some vertex with degree
one after hyperedges $E_1, E_2, \ldots, E_{j-1}$ are removed. Suppose $(V, \mathcal{E})$ has some hypercycle
$\mathcal{F} \ne \emptyset$. None of $E_1, E_2, \ldots, E_s$ can belong to $\mathcal{F}$: otherwise, there would be some minimum
$j$ for which $E_j \in \mathcal{F}$, and this $E_j$ has some vertex $v$ of degree one in the partial hypergraph
from which $E_1, E_2, \ldots, E_{j-1}$ have been removed, which contains $\mathcal{F}$; so $v$ cannot have even
degree in $\mathcal{F}$, which is a contradiction. This proves (i), and (ii) follows. For (iii), say
$c := |V(\mathcal{C})| < |\mathcal{C}| =: r$. Then there are $2^r - 1$ non-empty partial hypergraphs of $\mathcal{C}$, but only
$2^c < 2^r - 1$ possible indicator vectors for a set of vertices of odd degree. By the pigeonhole
principle, there must be two distinct partial hypergraphs $\mathcal{F}, \mathcal{F}' \subseteq \mathcal{E}$ for which the sets of
vertices of odd degree are the same. Then the symmetric difference $\mathcal{F} \triangle \mathcal{F}'$ is a hypercycle. □



5.2 The 2-core in uniform random hypergraphs 

In this subsection we consider a certain uniform random hypergraph model, which is
different from (but related to) the hypergraph model induced by our random matrix
$M(n, m_n)$; in Section 5.3 we will connect the two models. The incidences (non-zero
entries) in a random incidence matrix may be viewed as edges of a random bipartite 
graph, whose left and right nodes are the row labels and column labels, respectively. A 
standard probability model for a random bipartite graph is to fix the degrees of all nodes 
in advance (subject to a consistency condition), and sample uniformly from the bipartite 
graphs with this set of left and right node degrees. 

By interpreting an incidence matrix as a hypergraph, as above, this automatically 
gives a uniform random hypergraph model, where the degree of a left node is the weight 
of a hyperedge (meaning its number of incident vertices), and the degree of a right node is 
the vertex degree defined above. Darling and Norris [12] analyse the statistical properties 
of the 2-core for such random hypergraphs under suitable conditions on the hyperedge- 
weight and vertex-degree distributions. In unpublished work of the same authors, a 
generalization to unbounded vertex degrees and row weights is given, under finite third 
moments assumptions. 

For the purposes of the present paper, we require a more modest relaxation of the
conditions of [12], to cover the case where the row weights remain uniformly bounded but
the vertex degrees are approximately Poisson distributed.

For each $n$, define vectors of nonnegative integers $d_n := (d_n(k) : k \in \mathbb{Z}_+)$ and $w_n :=
(w_n(k) : k \in \mathbb{N})$ with $\sum_{k\ge0} d_n(k) = n$ and $m_n := \sum_{k\ge1} w_n(k)$; we assume that $d_n$ and
$w_n$ are compatible in the sense that $\sum_{k\ge1} k\,w_n(k) = \sum_{k\ge0} k\,d_n(k) < \infty$. We also assume
that $m_n \to \infty$. Suppose that for each $i \in \mathbb{N}$ and $j \in \mathbb{Z}_+$,

$$\lim_{n\to\infty}\frac{w_n(i)}{\sum_{k\ge1}w_n(k)} = \rho_i; \qquad \lim_{n\to\infty}\frac{d_n(j)}{n} = \nu_j. \qquad (5.1)$$

Define generating functions $\rho(s) := \sum_{k\ge1}\rho_k s^k$ and $\nu(s) := \sum_{k\ge0}\nu_k s^k$. We assume that
the weights are uniformly bounded, i.e., $\rho_w = 0$ for all $w$ sufficiently large, and the degree
distribution has all moments, i.e., $\sum_{k\ge1}\nu_k k^{\beta} < \infty$ for all $\beta > 0$. Under these conditions,
$\rho'(1)$ and $\nu'(1)$ are the (finite) means corresponding to these distributions.

Consider a sequence of random hypergraphs with n vertices and m n hyperedges, se- 
lected uniformly from those hypergraphs with edge weight multiplicities w n and vertex 
degree multiplicities d n . 

To present asymptotic results for the 2-cores of a sequence of such uniform random 
hypergraphs, it helps to introduce the notion of sampling a single incidence uniformly at 
random from all incidences in a hypergraph. Denote such an incidence (E, v). Denote the 
weight of E by S + 1, and the degree of v by L + 1; thus S is the number of other vertices 
in this hyperedge, and L is the number of other hyperedges incident to this vertex. Size 
bias occurs here: the event that E has weight k occurs with probability proportional to 
k times the number of rows of weight k, and similarly the probability that the degree of 
v is d is proportional to d times the number of degree d vertices. Given the $\rho_w$ and $\nu_d$
describing the limiting row weight and vertex degree distributions, we may thus compute
a pair of limiting probability generating functions for $L$ and $S$, respectively:

$$\lambda(s) := \mathbb{E}[s^L] = \sum_{d=0}^{\infty}\lambda_d s^d; \qquad \sigma(s) := \mathbb{E}[s^S] = \sum_{w=0}^{\infty}\sigma_w s^w, \qquad (5.2)$$

where, due to the size biasing, the coefficients in (5.2) are given by

$$\lambda_d = \frac{(d+1)\nu_{d+1}}{\nu'(1)}; \qquad \sigma_w = \frac{(w+1)\rho_{w+1}}{\rho'(1)}.$$

Hence the generating functions themselves become:

$$\lambda(s) = \frac{\nu'(s)}{\nu'(1)}; \qquad \sigma(s) = \frac{\rho'(s)}{\rho'(1)}. \qquad (5.3)$$

To avoid triviality, we assume that $\sigma_0 = 0$ (equivalently, $\rho_1 = 0$), i.e., there are no
1-edges, and $\lambda_0 \notin \{0, 1\}$ (otherwise the 2-core is of no interest). Define

$$\varphi(s) := 1 - \lambda(1 - \sigma(s)). \qquad (5.4)$$

From the conditions $\sigma_0 \ne 1$ and $\lambda_0 \ne 1$, we deduce that $\varphi : [0, 1] \to \mathbb{R}$ is strictly
increasing; moreover $\varphi(0) = 1 - \lambda(1 - \sigma(0)) = 0$ (since $\sigma_0 = 0$) and $\varphi(1) = 1 - \lambda_0 \in (0, 1)$,
so $\varphi$ takes values in $[0, 1)$, and there exists a largest solution $g^*$ in $[0, 1)$ of the equation
$\varphi(s) = s$. That is,

$$g^* := \sup\{s \in [0, 1) : \varphi(s) = s\}. \qquad (5.5)$$

In the case where $g^* > 0$ and the curve $y = \varphi(s)$ crosses the curve $y = s$ (rather than
just touching it) at $s = g^*$, we also have

$$g^* = \sup\{s \in (0, 1) : \varphi(s) > s\}. \qquad (5.6)$$

Now we can state the result on the 2-core that we shall use, which amounts to a
variant of Theorem 7.1 of [12].



Theorem 5.2. Consider a sequence of uniform random hypergraphs associated with
sequences $w_n$ and $d_n$ satisfying (5.1) with $\rho_w = 0$ for all $w$ large enough and $\sum_{d\ge1}\nu_d d^{\beta} < \infty$
for all $\beta > 0$. Suppose that the corresponding pair (5.2) of random-incidence generating
functions has $\sigma_0 = 0$, $\lambda_0 \notin \{0, 1\}$; and is such that $g^*$, given by (5.5), has either $g^* = 0$
or $g^*$ satisfying (5.6). Then the following hold a.s. in the limit as $n \to \infty$.

(i) If $g^* = 0$, the proportion of hyperedges which survive in the 2-core converges to zero.

(ii) If $g^* > 0$, then for any $k \in \mathbb{Z}_+$ with $\rho_k > 0$, the proportion of weight-$k$ hyperedges
which survive in the 2-core is asymptotically $(g^*)^k$; overall, a proportion $\rho(g^*)$ of
hyperedges survive, and a proportion $g^*\sigma(g^*)$ of incidences.

(iii) If $g^* > 0$, then for any $d, k \in \mathbb{N}$ with $2 \le d \le k$ and $\nu_k > 0$, the proportion of
vertices of degree $k$ whose degree in the 2-core is $d$ converges to

$$\binom{k}{d}\sigma(g^*)^d(1 - \sigma(g^*))^{k-d}.$$

(iv) If $g^* > 0$, the 2-core is again a uniform random hypergraph, given its hyperedge
weights and vertex degrees, whose distributions are determined by the previous
assertions.



As mentioned above, in [12] all but finitely many coefficients of the generating
functions (5.2) were taken to be zero, but the methods admit the modest extension of this
section, and indeed can be extended to the case where $\lambda''(1)$ and $\sigma''(1)$ are finite,
corresponding to finite third moments for hyperedge weight and vertex degree distributions.
Because of its proximity to the result in [12], we do not prove Theorem 5.2 here.
5.3 Application to M(n, m)

In the previous subsection we assumed the row and column weights of our random
matrix were specified in advance, but now we return to the random matrix model used in
the rest of the paper, so our random $m_n \times n$ incidence matrix $A$ will be precisely the
matrix $M(n, m_n)$, described in Section 2.1, i.e., with i.i.d. rows with weights having the
distribution of $W_n$, and corresponding generating function $\rho_n(s)$ having limit $\rho(s)$.

To justify being able to apply Theorem 5.2 in this setting, we give the following strong
law of large numbers for the empirical distributions of the row and column weights of $M$.

Lemma 5.3. Suppose $m_n \in \mathbb{N}$ with $m_n/n \to \alpha$ as $n \to \infty$, with $\alpha > 0$, and the $W_n$ are
uniformly bounded. Let $k \in \mathbb{Z}_+$; let $N_k(n)$ be the number of rows of $M(n, m_n)$ of weight
$k$, and let $\tilde{N}_k(n)$ be the number of columns of $M(n, m_n)$ of degree $k$. Then a.s.,

$$\lim_{n\to\infty} m_n^{-1}N_k(n) = \mathbb{P}[W = k], \text{ and} \qquad (5.7)$$

$$\lim_{n\to\infty} n^{-1}\tilde{N}_k(n) = e^{-\mu}\mu^k/k!, \qquad (5.8)$$

where we set $\mu := \alpha\mathbb{E}[W] = \alpha\rho'(1)$. Moreover, the total number of incidences satisfies
the law of large numbers

$$\lim_{n\to\infty} n^{-1}\sum_{k\ge0}kN_k(n) = \lim_{n\to\infty} n^{-1}\sum_{k\ge0}k\tilde{N}_k(n) = \mu, \text{ a.s.} \qquad (5.9)$$



Proof. First note that $m_n^{-1}\mathbb{E}[N_k(n)] = \mathbb{P}[W_n = k]$, which converges to $\mathbb{P}[W = k]$ by
assumption. To deduce almost sure convergence from this convergence in means, we use
the Azuma-Hoeffding inequality in a standard way, as follows. Fix $n$ and for $1 \le i \le m_n$
let $\mathcal{F}_i$ be the $\sigma$-algebra generated by the first $i$ rows of $M(n, m_n)$. Define $\xi_i =
\mathbb{E}[N_k(n) \mid \mathcal{F}_i]$, with $\xi_0 = \mathbb{E}[N_k(n)]$. Since resampling a single row changes the number of
rows of weight $k$ by at most 1, we have for $1 \le i \le m_n$ that

$$|\xi_i - \xi_{i-1}| = |\mathbb{E}[N_k(n) - N_k(n, i) \mid \mathcal{F}_i]| \le 1,$$

where $N_k(n, i)$ is defined like $N_k(n)$ but based on a matrix with the $i$th row resampled.
By the Azuma-Hoeffding inequality applied to the martingale $(\xi_0, \ldots, \xi_{m_n})$ we obtain for
any $\varepsilon > 0$ that

$$\mathbb{P}[|N_k(n) - \mathbb{E}[N_k(n)]| > \varepsilon n] \le 2\exp(-\varepsilon^2n/2),$$

so by the first Borel-Cantelli lemma, we have $|N_k(n) - \mathbb{E}[N_k(n)]| \le \varepsilon n$ for all but finitely
many $n$ almost surely. Combined with the convergence of the mean, this gives us (5.7).

The remaining two parts of the lemma use the assumption $\mathbb{P}[W \le r_1] = 1$ for $r_1 < \infty$.
To prove (5.8) note that the weight of the first column (or any other column) of $M(n, m_n)$
is binomially distributed with parameters $m_n$ (number of trials) and $\mathbb{E}[W_n]/n$ (probability
of success). Hence $\mathbb{E}[\tilde{N}_k(n)/n] = \mathbb{P}[\mathrm{Bin}(m_n, \mathbb{E}[W_n]/n) = k]$, and by binomial-Poisson
convergence this tends to $e^{-\mu}\mu^k/k!$ as $n \to \infty$. Given this convergence of means, we
may prove (5.8) by a similar argument (based on the Azuma-Hoeffding inequality) to the
one used to prove (5.7), since resampling a single row changes the number of columns of
degree $k$ by at most $r_1$.

For the final statement in the lemma, we have that

$$n^{-1}\sum_{k\ge0}kN_k(n) = (m_n/n)\sum_{k=0}^{r_1}k\,m_n^{-1}N_k(n) \to \alpha\sum_{k=0}^{r_1}k\,\mathbb{P}[W = k], \text{ a.s.},$$

by (5.7), and then (5.9) follows. □
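The Poisson limit for column degrees can be observed in a small simulation. The sketch below is illustrative only: it assumes a constant row weight (a special case of the bounded-weight setting), and the function names, the seed, and the matrix dimensions are arbitrary choices made here. Each row places its ones in a uniformly random set of columns, and the empirical proportion of columns of each degree is compared with the Poisson probabilities with mean $\mu = \alpha\,\mathbb{E}[W]$.

```python
import math
import random
from collections import Counter

def column_degree_proportions(n, m, weight, seed=0):
    """Simulate M(n, m) with constant row weight; return {degree: proportion of columns}."""
    rng = random.Random(seed)
    degree = [0] * n
    for _ in range(m):
        for col in rng.sample(range(n), weight):  # uniform row given its weight
            degree[col] += 1
    counts = Counter(degree)
    return {k: counts[k] / n for k in sorted(counts)}

def poisson_pmf(k, mu):
    """P[Po(mu) = k]."""
    return math.exp(-mu) * mu ** k / math.factorial(k)

if __name__ == "__main__":
    n, m, weight = 4000, 3600, 3
    mu = weight * m / n  # alpha * E[W] with alpha = m / n
    proportions = column_degree_proportions(n, m, weight)
    for k in range(6):
        print(k, round(proportions.get(k, 0.0), 4), round(poisson_pmf(k, mu), 4))
```

The total number of incidences divided by $n$ equals $\mu$ exactly in this constant-weight case, matching (5.9), while the individual proportions fluctuate around the Poisson values at the scale $n^{-1/2}$.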
Corollary 5.4. Consider the random matrix model $M(n, m_n)$ with row weight
distribution $W_n \xrightarrow{d} W$, where the $W_n$ are uniformly bounded. Suppose that $m_n/n \to \alpha > 0$.
Then, a.s., taken as hypergraph incidence matrices the sequence $M(n, m_n)$ defines a
sequence of uniform random hypergraphs whose row weight and vertex degree distributions
satisfy (5.1) with $\rho_k$ and $\nu_d$ given by $\rho_k = \mathbb{P}[W = k]$ and $\nu_d = e^{-\mu}\mu^d/d!$ respectively,
where $\mu := \alpha\mathbb{E}[W]$.

Proof. Since the distribution of $M(n, m)$ is invariant under permutations of the rows
or columns, conditional on the empirical distribution of row and column weights, all
possible outcomes with those row and column weight distributions are equally likely, so
this conditional distribution is indeed uniform. Moreover by Lemma 5.3 the limiting
proportion of rows of weight $k$ is given by $\mathbb{P}[W = k]$ and the limiting proportion of
columns of degree $k$ is given by $\mathbb{P}[D = k]$ where $D \sim \mathrm{Po}(\mu)$, with $\mu := \alpha\rho'(1)$. Hence
conditionally on this sequence of empirical distributions, almost surely we have a sequence
of random matrices satisfying the hypotheses of Section 5.2. □



In the notation of Section 5.2, in this case $\nu(s) = \sum_{d=0}^{\infty}e^{-\mu}\mu^d s^d/d! = e^{\mu(s-1)}$ is the
generating function of a $\mathrm{Po}(\mu)$ random variable, so, by (5.3), the pair (5.2) becomes

$$\lambda(s) = e^{\mu(s-1)}; \qquad \sigma(s) = \frac{\rho'(s)}{\rho'(1)}.$$

In this case, we have from (5.4) that

$$\varphi(s) := 1 - \lambda(1 - \sigma(s)) = 1 - e^{-\mu\sigma(s)} = 1 - e^{-\alpha\rho'(s)},$$

and to emphasize the dependence on $\alpha$ we will use the notation $\varphi_\alpha$ for $\varphi$ from now on.
Recall from (5.5) that $g^*$ was defined as the largest $s \in [0, 1)$ for which $\varphi_\alpha(s) = s$. In
order for the model of this section to fit into the setting discussed in Section 5.2 we need
to assume that $\sigma_0 = 0$ and $\lambda_0 \notin \{0, 1\}$. Here $\lambda_0 = e^{-\mu} = e^{-\alpha\mathbb{E}[W]}$ and $\sigma_0 = \mathbb{P}[W = 1]/\mathbb{E}[W]$. So it
suffices to assume that $\alpha > 0$, $\mathbb{P}[W \ge 2] = 1$, and $\mathbb{E}[W] < \infty$; in this case the argument
in Section 5.2 shows that $g^*$ is well defined.

Note that $g^*$ depends both on $\rho$ and on $\alpha$; in this section we write $g^* = g^*(\alpha)$ to
emphasize the dependence on $\alpha$; we will show (see Lemma 5.5) that the present definition
is equivalent to that at (2.9) given in Section 2.2. For any solution $s \in [0, 1)$ to $\varphi_\alpha(s) = s$,
so in particular for $s = g^*(\alpha)$, provided $\rho'(s) \ne 0$, we have $\alpha = h(s)$ as given by (2.7).
We note some facts about $g^*(\alpha)$; recall the definition of $\alpha^\sharp_\rho$ from (2.8).



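Lemma 5.5. Suppose that $\mathbb{P}[W \ge 2] = 1$ and $\mathbb{E}[W] < \infty$. With the convention $\sup\emptyset = 0$,
the definition (2.9) is equivalent to the definition (5.5) of $g^*(\alpha)$ as the largest solution
of $\varphi_\alpha(s) = s$. Also, $\alpha^\sharp_\rho \in [0, 1]$, and $g^*(\alpha) = 0$ for all $\alpha \in [0, \alpha^\sharp_\rho)$, and for $\alpha > \alpha^\sharp_\rho$, the
function $g^*(\alpha)$ is positive and strictly increasing, with $g^*(\alpha) \uparrow 1$ as $\alpha \to \infty$.

Now assume also that $\mathbb{P}[W \ge 3] = 1$ and $\mathbb{E}[W^2] < \infty$. Then the following hold.

(i) We have $g^*(\alpha^\sharp_\rho) \in (0, 1)$ and $\alpha^\sharp_\rho = h(g^*(\alpha^\sharp_\rho)) \in (0, \infty)$.

(ii) The function $g^*$ is right continuous, and there is a finite set $\mathcal{D}_\rho \subset (0, \infty)$, with
$\alpha^\sharp_\rho = \inf\mathcal{D}_\rho$, such that $g^*$ is continuous apart from jumps at points of $\mathcal{D}_\rho$. For each
$\alpha \in \mathcal{D}_\rho$, $h(g^*(\alpha)) = \alpha$ is a local minimum for $h$.

(iii) If $\alpha \notin \mathcal{D}_\rho$, then $g^*(\alpha)$ satisfies the crossing condition (5.6).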



Proof. Since $\mathbb{P}[W \ge 2] = 1$ and $\mathbb{E}[W] < \infty$, we have $\varphi_\alpha(1) < 1$ and $\varphi_\alpha(0) = 0$. Therefore
by continuity we may rewrite (5.5) as

$$g^*(\alpha) = \sup\{s \in (0, 1) : \varphi_\alpha(s) \ge s\},$$

using the convention $\sup\emptyset = 0$. By the definition (2.7) of $h$, for $s \in (0, 1)$ it is easy to
check that $\varphi_\alpha(s) \ge s$ if and only if $h(s) \le \alpha$, and this shows that (2.9) and (5.5) give
equivalent definitions of $g^*(\alpha)$.

By (2.7) and subsequent remarks, $h(x)$ is positive, continuous in $x$, and tends to
infinity as $x \uparrow 1$. By the definition (2.8), and the subsequent remark, $\alpha^\sharp_\rho \in [0, 1]$. By
the definition (2.9), it is clear that $g^*(\alpha) = 0$ for $\alpha \in [0, \alpha^\sharp_\rho)$, and the fact that $g^*(\alpha)$ is
positive and strictly increasing for $\alpha \in (\alpha^\sharp_\rho, \infty)$ is easily deduced from the continuity of
$h$. Also, given $\varepsilon \in (0, 1)$ we can choose $\alpha$ with $h(1 - \varepsilon) < \alpha$ so that $g^*(\alpha) \ge 1 - \varepsilon$, and
together with the monotonicity of $g^*$ this shows $g^*(\alpha) \to 1$ as $\alpha \to \infty$.

For part (i), under the extra assumption $\mathbb{P}[W \ge 3] = 1$ we have $h$ going to infinity at
0 and at 1, and by continuity $h$ attains its infimum on $(0, 1)$, so using (2.8) and (2.9) we
have that $g^*(\alpha^\sharp_\rho)$ is the supremum of a non-empty compact set contained in $(0, 1)$, and so
lies in $(0, 1)$. The last part of (i) also follows from the continuity of $h$.

For part (ii), under the extra assumption $\mathbb{E}[W^2] < \infty$, note first that if $0 \le y < \alpha^\sharp_\rho$
then $g^*(y) = 0$. Hence $g^*$ is continuous at $y$ for all $y < \alpha^\sharp_\rho$.

Now let $y \ge \alpha^\sharp_\rho$; note that by (2.9) and continuity of $h$, we have $h(g^*(y)) = y$. Take a
monotonic sequence $y_n$ tending to $y$; set $x_n = g^*(y_n)$.

Suppose first that $y_n \downarrow y$. Then the sequence $x_n$ is nonincreasing; denoting the limit
by $x_\infty$ we have $h(x_n) = y_n$ so $h(x_\infty) = y$ by continuity, and therefore $x_\infty \le g^*(y)$ by (2.9).
Since also $x_n \ge g^*(y)$ by monotonicity we have $x_\infty = g^*(y)$; hence $g^*$ is right-continuous
at $y$.

Now suppose instead that $y_n \uparrow y$. Set $x = g^*(y)$. If $h$ does not have a local minimum
at $x$ then $\liminf g^*(y_n) \ge x$, so that $g^*(y_n) \to x$ (since $g^*(y_n) \le x$ by monotonicity) and hence $g^*$ is left-continuous at $y$. Hence,
if $g^*$ is discontinuous at $y$ then $h$ has a local minimum at $g^*(y)$.

The function $h'$ is analytic and non-constant on $(0, 1)$ so its zeros do not accumulate
except possibly at 0 or 1. However $h'(x) = 0$ implies $\rho'(x)/\rho''(x) = -(1 - x)\log(1 - x)$,
so by the assumption $\mathbb{E}[W^2] < \infty$ there exists $\varepsilon > 0$ such that $h'(x) \ne 0$ for $1 - \varepsilon < x < 1$
and for $0 < x < \varepsilon$; for the latter case we use the fact that, as $x \downarrow 0$,

$$-\frac{\rho''(x)}{\rho'(x)}(1 - x)\log(1 - x) \to r_0 - 1 > 1,$$

if $r_0 \ge 3$ is the smallest possible value of $W$. Thus $h$ has only finitely many local minima
in $(0, 1)$, and hence $h$ has a local minimum at $g^*(y)$ for at most finitely many $y$. This
completes the proof of (ii).

For part (iii) note that, for $s \in (0, 1)$, $\varphi_\alpha(s) > s$ if and only if $h(s) < \alpha$, so (5.6) gives
$g^*(\alpha) = \sup\{s \in (0, 1) : h(s) < \alpha\}$, which for $\alpha \notin \mathcal{D}_\rho$ agrees with the definition (2.9). □
To apply the results in Section 5.2 to $M(n, m_n)$, we need to assume that $g^*(\alpha)$ either is zero or satisfies (5.6). Lemma 5.5 shows that a sufficient condition for this is that $\alpha \in (0,\infty)$, $\alpha \notin \mathcal{D}_\rho$.

By Theorem 5.2(ii) and (5.9), $n^{-1}$ times the number of incidences which survive in the 2-core converges a.s. to
\[ (\alpha\rho'(1))\, g^* \frac{\rho'(g^*)}{\rho'(1)} = \alpha g^* \rho'(g^*) = -g^* \log(1 - g^*). \tag{5.10} \]
By Theorem 5.2(iii) and (5.8), for $d \ge 2$ the proportion of original vertices whose degree in the 2-core is $d$ is asymptotically
\[ e^{-\mu_\alpha(g^*)}\, \frac{\mu_\alpha(g^*)^d}{d!}, \tag{5.11} \]



the remainder having degree 0 in the 2-core (some columns in the core have degree zero because in the algorithm of Section 5.1, we never delete any columns). In other words, the 2-core vertex degrees have the distribution of a random variable $D\mathbf{1}\{D \ne 1\}$, where $D \sim \mathrm{Po}(\mu_\alpha(g^*))$; by (5.10), $\mu_\alpha(g^*) = \alpha\rho'(g^*)$.

As a check on the previous calculation (5.10) of the number of surviving incidences, $n^{-1}$ times the total number of incidences in the 2-core should converge to the mean of the vertex-degree distribution, which is
\[ \alpha\rho'(g^*)\left(1 - e^{-\alpha\rho'(g^*)}\right) = \alpha g^* \rho'(g^*), \]
as in (5.10).



The key issue in predicting hypercycles and rank deficiency is whether the number of rows in the 2-core exceeds the number of occupied columns in the 2-core, which is treated in Theorem 5.6; a related result appears in [8]. Recall the definition of $\psi$ from (2.6) and $\underline{\alpha}_\rho$ from (2.10).



Theorem 5.6. Suppose $W_n$ are uniformly bounded and $\mathbb{P}[W \ge 3] = 1$. Let $\alpha \in (0,\infty)$. Consider the 2-core of the random incidence matrix $M(n, m_n)$ where $m_n/n \to \alpha$ as $n \to \infty$. Then if $\alpha < \alpha^\sharp_\rho$, the number of rows in the 2-core is $o(n)$, a.s.

Now suppose $\alpha > \alpha^\sharp_\rho$, so $g^* = g^*(\alpha) > 0$, and suppose that $\alpha \notin \mathcal{D}_\rho$. Then:

(i) $n^{-1}$ times the number of rows in the 2-core converges a.s. to $\alpha\rho(g^*)$.

(ii) $n^{-1}$ times the number of occupied columns in the 2-core converges a.s. to $1 - e^{-\nu}(1 + \nu)$, where $\nu := \alpha\rho'(g^*)$.

(iii) Almost surely, for all $n$ large enough, the 2-core has more rows than occupied columns if $\psi(g^*(\alpha)) < 0$ but has fewer rows than occupied columns if $\psi(g^*(\alpha)) > 0$. Moreover, there exists $\delta > 0$ such that if $\alpha \in (\underline{\alpha}_\rho, \underline{\alpha}_\rho + \delta)$, for all $n$ large enough, the 2-core has more rows than columns and so the corresponding hypergraph has a hypercycle.

Figure 3 shows an example of some of the more exotic behaviour that can occur in the random weight setting. In the case where $\rho(s) = 0.9183s^3 + 0.04s^{19} + 0.0417s^{41}$, $\psi(g^*(\alpha))$ changes sign several times, and so Theorem 5.6 shows that as $\alpha$ increases from 0 the 2-core switches from having asymptotically more columns than rows to having more rows than columns not just once (at $\underline{\alpha}_\rho$), but twice. This non-monotone behaviour does not occur in the fixed weight case. Proposition 5.8 below, and the subsequent discussion, explain some of the features in the figure.
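The jump structure described for this example can be explored numerically. The following is a minimal sketch, assuming $h(x) = -\log(1-x)/\rho'(x)$ and $g^*(\alpha) = \sup\{x \in (0,1) : h(x) \le \alpha\}$ (with $g^*(\alpha) = 0$ when the set is empty), as used in the proofs of this section; the grid resolution and the helper names `alpha_sharp` and `g_star` are illustrative choices, not part of the original text.

```python
import math

# Coefficients of rho(s) = 0.9183 s^3 + 0.04 s^19 + 0.0417 s^41 (exponent -> weight).
COEFFS = {3: 0.9183, 19: 0.04, 41: 0.0417}

def rho_prime(x):
    """Derivative of the generating function rho."""
    return sum(c * k * x ** (k - 1) for k, c in COEFFS.items())

def h(x):
    """h(x) = -log(1 - x) / rho'(x), assumed form of definition (2.7)."""
    return -math.log1p(-x) / rho_prime(x)

# Uniform grid on (0, 1); a hypothetical resolution chosen for illustration.
GRID = [i / 100000 for i in range(1, 100000)]
H_VALUES = [h(x) for x in GRID]

def alpha_sharp():
    """Grid approximation of inf h over (0, 1) and of the point where it is attained."""
    value = min(H_VALUES)
    return value, GRID[H_VALUES.index(value)]

def g_star(alpha):
    """Grid approximation of sup{x in (0,1): h(x) <= alpha}, or 0 if the set is empty."""
    hits = [x for x, v in zip(GRID, H_VALUES) if v <= alpha]
    return max(hits) if hits else 0.0
```

Evaluating `g_star` just below and just above $\alpha \approx 0.991044$ shows the second jump reported in the caption of Figure 3.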



Proof of Theorem 5.6. By Corollary 5.4, a.s. we have a sequence of random matrices satisfying the hypotheses of Theorem 5.2. If $\alpha < \alpha^\sharp_\rho$, then $g^*(\alpha) = 0$, and Theorem 5.2 shows that the 2-core has $o(n)$ rows. So from now on suppose that $\alpha > \alpha^\sharp_\rho$, so $g^* = g^*(\alpha) > 0$ (see Lemma 5.5).

For the statement (i), note that out of $m_n \sim \alpha n$ rows, a proportion $\rho(g^*)$ survives by Theorem 5.2(ii). For (ii), the discussion around (5.11) implies that the proportion of the $n$ original vertices whose degree in the 2-core is non-zero is obtained by subtracting from 1 the mass that a $\mathrm{Po}(\nu)$ random variable places on $\{0, 1\}$.

Figure 3: Example with $\rho(s) = 0.9183s^3 + 0.04s^{19} + 0.0417s^{41}$. The left plot shows parts of the curves $y = h(x)$ (dotted line) and $x = g^*(y)$ (solid line). The right plot shows parts of the curves $y = \psi(x)$ (dashed line) and the locus of $(g^*(\alpha), \psi(g^*(\alpha)))$ (solid line). Again $g^*(\alpha)$ has two discontinuities, one at $\alpha = \alpha^\sharp_\rho \approx 0.890061$ and one at $\alpha \approx 0.991044$, with the first corresponding to a jump from $g^* = 0$ to $g^* \approx 0.720793$ and the second to a jump from $g^* \approx 0.929269$ to $g^* \approx 0.973325$. The right plot shows the three positive roots of $\psi(x) = 0$. The two of these roots achieved by $\psi(g^*(\alpha))$ are at $x \approx 0.928538$ and $x \approx 0.975069$. The first corresponds to $\alpha = \underline{\alpha}_\rho \approx 0.990686$ and the second to $\alpha \approx 0.991185$. Hence as $\alpha$ ranges in $(0,1)$, $\psi(g^*(\alpha))$ changes sign from positive, to negative, to positive, and finally to negative again.

For (iii), we compare the limits in (i) and (ii). Suppose that these limits satisfy
\[ \alpha\rho(g^*) > 1 - e^{-\alpha\rho'(g^*)}\left(1 + \alpha\rho'(g^*)\right). \tag{5.12} \]
By our assumptions on $\alpha$ and $W$, we have $\rho(0) = \rho'(0) = 0$ and $g^* > 0$, which implies that $\rho(g^*)$ and $\rho'(g^*)$ are both positive. Then we may rewrite (5.12) as
\[ \alpha\rho(g^*) > 1 - (1 - g^*)\left(1 + \alpha\rho'(g^*)\right) = g^* + (1 - g^*)\log(1 - g^*), \]
using the definition of $g^*$. Now substituting in $\alpha = h(g^*(\alpha))$ for $\alpha$ on the left-hand side of the last display (given $\rho'(g^*) > 0$) we may rewrite the last inequality as $\psi(g^*) < 0$, where $\psi$ is defined by (2.6). Similarly, $\psi(g^*) > 0$ is equivalent to
\[ \alpha\rho(g^*) < 1 - e^{-\alpha\rho'(g^*)}\left(1 + \alpha\rho'(g^*)\right). \tag{5.13} \]
If $\psi(g^*) < 0$, then (5.12) holds and the limit in (i) is strictly greater than the limit in (ii), which shows that the 2-core eventually has more rows than occupied columns, and vice versa if $\psi(g^*) > 0$ (so that (5.13) holds).



Recall the definition of $\underline{\alpha}_\rho$ from (2.10). We know from Lemma 5.5(iii) that $g^*$ has only finitely many discontinuities. Either $\underline{\alpha}_\rho$ is a continuity point for $g^*(\alpha)$ (and hence for $\psi(g^*(\alpha))$), or else $\underline{\alpha}_\rho \in \mathcal{D}_\rho$ with $\psi(g^*(\underline{\alpha}_\rho)) < 0$ and no other point of $\mathcal{D}_\rho$ is in a neighbourhood of $\underline{\alpha}_\rho$. In either case, $\psi(g^*(\alpha)) < 0$ for $\alpha$ in an interval of the form $(\underline{\alpha}_\rho, \underline{\alpha}_\rho + \delta)$ with $\delta > 0$. The final statement in the theorem, about the existence of a hypercycle, follows from Lemma 5.1(iii). □
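The equivalence between (5.12) and the sign of $\psi$ can be checked numerically. The sketch below is only an illustration under stated assumptions: it takes the fixed-weight case $\rho(s) = s^3$, uses $h(x) = -\log(1-x)/\rho'(x)$, and assumes that (2.6) has the form $\psi(x) = x + (1 - x + \rho(x)/\rho'(x))\log(1-x)$, which is the expression produced by the rearrangement in the proof above. The function names are hypothetical.

```python
import math

def rho(x, r=3):
    """Generating function for a fixed row weight r (assumption: W = r a.s.)."""
    return x ** r

def rho_prime(x, r=3):
    return r * x ** (r - 1)

def h(x, r=3):
    """h(x) = -log(1 - x) / rho'(x)."""
    return -math.log1p(-x) / rho_prime(x, r)

def psi(x, r=3):
    """Assumed form of (2.6): psi(x) = x + (1 - x + rho/rho') * log(1 - x)."""
    return x + (1.0 - x + rho(x, r) / rho_prime(x, r)) * math.log1p(-x)

def row_and_column_limits(g, r=3):
    """Limits from Theorem 5.6 (i) and (ii), evaluated at alpha = h(g)."""
    alpha = h(g, r)
    nu = alpha * rho_prime(g, r)            # equals -log(1 - g)
    rows = alpha * rho(g, r)                # limit of (rows in 2-core) / n
    cols = 1.0 - math.exp(-nu) * (1.0 + nu) # limit of (occupied columns) / n
    return rows, cols
```

With these definitions, `psi(g)` coincides with `cols - rows`, so a negative value of $\psi$ corresponds to more rows than occupied columns.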



We next state a result giving an upper bound for $\underline{\alpha}_\rho$.

Proposition 5.7. Suppose that $\mathbb{P}[W \ge 3] = 1$ and $\mathbb{E}[W^2] < \infty$. Then $\underline{\alpha}_\rho \le 1$.



Proof. We know from Lemma 5.5 that $\alpha^\sharp_\rho < 1$, so if $\underline{\alpha}_\rho \le \alpha^\sharp_\rho$ there is nothing to prove. Hence we assume $\underline{\alpha}_\rho > \alpha^\sharp_\rho$ from now on. First we show that
\[ \text{for any } \varepsilon > 0 \text{ there exists } \alpha \in (\underline{\alpha}_\rho - \varepsilon, \underline{\alpha}_\rho) \text{ such that } \psi(g^*(\alpha)) > 0. \tag{5.14} \]
By the definition (2.10) of $\underline{\alpha}_\rho$, and the assumption $\underline{\alpha}_\rho > \alpha^\sharp_\rho$, if (5.14) fails then there exists $\delta > 0$ such that $\psi \circ g^*$ is identically zero on the interval $I := (\underline{\alpha}_\rho - \delta, \underline{\alpha}_\rho)$, and by taking $\delta$ small enough we may assume the interval $I$ contains no discontinuities of $g^*$. But then the image $J := g^*(I)$ is also an open interval because $g^*$ is continuous and strictly increasing on $I$. So we would then have $\psi$ identically zero on $J$, which would contradict the fact that $\psi$ is analytic and non-constant on $(0,1)$. Thus (5.14) must hold as asserted.

Observe next that every time the 2-core algorithm deletes a row, it has to create at 
least one column of degree zero, and possibly more. So the aspect ratio (i.e., number 
of rows divided by number of occupied columns) is nondecreasing at each step of the 
algorithm, provided the initial aspect ratio is at least 1. Hence the aspect ratio of a 
non-empty 2-core is at least as large as the aspect ratio of the original incidence matrix 
to which the algorithm is applied, provided the latter is at least 1. 
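The monotonicity of the aspect ratio can be observed on a small example. The following is a minimal sketch of the peeling step, assuming each row is represented as a non-empty set of column indices; it deletes one qualifying row at a time, which is an implementation choice made here for illustration, and the name `two_core` is hypothetical.

```python
from fractions import Fraction

def two_core(rows):
    """Peel a binary matrix given as a list of non-empty sets of column indices.

    While some row meets a column of degree 1, delete one such row.
    Columns are never deleted; they only become empty.
    Returns the surviving row indices and the aspect ratio
    (rows / occupied columns) recorded after each deletion.
    """
    alive = set(range(len(rows)))
    degree = {}
    for cols in rows:
        for c in cols:
            degree[c] = degree.get(c, 0) + 1
    history = []
    while True:
        # First surviving row that touches a column of degree exactly 1.
        victim = next((i for i in sorted(alive)
                       if any(degree[c] == 1 for c in rows[i])), None)
        if victim is None:
            break
        alive.remove(victim)
        for c in rows[victim]:
            degree[c] -= 1
        occupied = sum(1 for d in degree.values() if d > 0)
        if alive:
            history.append(Fraction(len(alive), occupied))
    return alive, history

# Seven rows of weight 3 on six columns: initial aspect ratio 7/6.
example = [{0, 1, 2}, {0, 1, 3}, {0, 2, 3}, {1, 2, 3},
           {0, 1, 2}, {2, 3, 4}, {3, 4, 5}]
core_rows, ratios = two_core(example)
```

In this example two rows are peeled, and the recorded ratios rise from the initial 7/6 to 6/5 and then 5/4.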

So if $m_n/n \to \alpha > 1$, then the aspect ratio of the original matrix exceeds 1 for all $n$ large enough, and hence so does the aspect ratio of the 2-core, assuming it exists. Suppose that $\underline{\alpha}_\rho > 1$. Then by (5.14) and the finiteness of $\mathcal{D}_\rho$, there exists $\alpha' \in (1, \underline{\alpha}_\rho) \setminus \mathcal{D}_\rho$ such that $\psi(g^*(\alpha')) > 0$. Then, by Theorem 5.6(iii), with $m_n/n \to \alpha = \alpha'$, the 2-core has aspect ratio less than 1 for all $n$ large enough, which contradicts the previous conclusion that $\alpha > 1$ implied the 2-core having limiting aspect ratio greater than 1. Hence $\underline{\alpha}_\rho \le 1$. □

Next we give more information on the key functions $h$ and $\psi$, which should clarify the situation in Theorem 5.6(iii). By a root of $\psi$, we mean any number $x$ with $\psi(x) = 0$.

Proposition 5.8. Suppose that $\mathbb{P}[W \ge 3] = 1$ and $\mathbb{E}[W^2] < \infty$. Then $0 < \alpha^\sharp_\rho \le \underline{\alpha}_\rho \le 1$. The function $\psi$ has at least one root in $(0,1)$, and $h$ has at least one local minimum in $(0,1)$. Suppose that the following condition holds:

(a) $h$ has a single local minimum $x_\rho$ in $(0,1)$, with $h(x_\rho) = \inf_{x \in (0,1)} h(x)$.

Then $x_\rho$ is the location of the unique local maximum of $\psi$ in $(0,1)$, $\psi(x_\rho) > 0$, and the interval $(0,1)$ contains exactly one root of $\psi$, denoted $x^*_\rho$, which satisfies $x_\rho < x^*_\rho$. Moreover, $\underline{\alpha}_\rho = h(x^*_\rho) > \alpha^\sharp_\rho$, and
\[ \psi(g^*(\alpha)) < 0 \text{ for all } \alpha > \underline{\alpha}_\rho. \tag{5.15} \]
Finally, in the fixed row-weight case where $W = r \ge 3$ a.s., condition (a) holds, and the unique positive root of $\psi$ is $x^*_r \in \left(\frac{r-2}{r-1}, 1\right)$.
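For a fixed row weight the root $x^*_r$ and the value $h(x^*_r)$ are easy to approximate. The following is a minimal sketch by bisection, assuming $\rho(s) = s^r$, $h(x) = -\log(1-x)/\rho'(x)$, and the form $\psi(x) = x + (1 - x + \rho(x)/\rho'(x))\log(1-x)$ for (2.6); it is intended for small $r$ only, since the root approaches 1 quickly as $r$ grows. The helper names are hypothetical.

```python
import math

def psi_fixed(x, r):
    """psi for rho(s) = s**r, where rho(x)/rho'(x) = x / r (assumed form of (2.6))."""
    return x + (1.0 - x * (r - 1) / r) * math.log1p(-x)

def h_fixed(x, r):
    """h(x) = -log(1 - x) / (r * x**(r - 1))."""
    return -math.log1p(-x) / (r * x ** (r - 1))

def psi_root(r, tol=1e-13):
    """Bisection for the root of psi on ((r-2)/(r-1), 1); suitable for small r."""
    lo = (r - 2) / (r - 1)   # psi is positive here
    hi = 1.0 - 1e-12         # psi is negative here when r is small
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if psi_fixed(mid, r) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def central_difference(f, x, r, eps=1e-6):
    """Symmetric finite-difference approximation of f'(x)."""
    return (f(x + eps, r) - f(x - eps, r)) / (2 * eps)
```

For $r = 3$ this reproduces the values $x^*_3 \approx 0.883414$ and $h(x^*_3) \approx 0.917935$ quoted in the caption of Figure 4.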
An important observation that helps to explain the close connection between the functions $h$ and $\psi$ (apparent in Figure 1 for example) and will also form an ingredient in the proof of Proposition 5.8 is the following result.

Lemma 5.9. For all $x \in (0,1)$, $\psi'(x)$ has the same sign as $-h'(x)$, so, in particular, the locations of the local minima of $h$ correspond exactly to the locations of the local maxima of $\psi$ in $(0,1)$. Finally, if $\mathbb{E}[W^2] < \infty$, then as $x \downarrow 0$ we have
\begin{align*}
\psi(1-x) &= 1 - h(1-x) - x - \left(\frac{\rho''(1)}{2\rho'(1)} + o(1)\right) x^2 \log x \\
&= 1 - h(1-x) - x + o(x). \tag{5.16}
\end{align*}

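Proof. Differentiating (2.6), we obtain
\[ \psi'(x) = -\frac{\rho(x)}{\rho'(x)}\left[\frac{1}{1-x} + \frac{\rho''(x)}{\rho'(x)}\log(1-x)\right]. \tag{5.17} \]
On the other hand, from (2.7), we have that, for $x \in (0,1)$,
\[ h'(x) = \frac{1}{\rho'(x)}\left[\frac{1}{1-x} + \frac{\rho''(x)}{\rho'(x)}\log(1-x)\right] = -\frac{\psi'(x)}{\rho(x)}, \]
by comparison with (5.17). Finally, (5.16) follows from a routine calculation. □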



Before completing the proof of Proposition 5.8 we make some further remarks and present some examples. The main complication in the interpretation of Theorem 5.6(iii) is due to the fact that $g^*$ has discontinuities, so $\{\psi(g^*(\alpha)) : \alpha > 0\}$ is only a subset of $\{\psi(x) : x \in [0,1)\}$. Let
\[ Q_\rho := \{g^*(\alpha) : \alpha \ge \alpha^\sharp_\rho\}. \]
By Lemma 5.5(ii), $Q_\rho$ is a union of finitely many intervals $Q_\rho = [g_1^-, g_1^+) \cup \cdots \cup [g_j^-, g_j^+)$ where $g_1^- < g_1^+ < g_2^- < \cdots < g_j^+$, and, for each $k$, $g_k^- = g^*(\alpha)$ for $\alpha \in \mathcal{D}_\rho$, and $h(g_k^-)$ is a local minimum. Recall that $\alpha = h(g^*(\alpha))$ and $\alpha \mapsto g^*(\alpha)$ is increasing for $\alpha \ge \alpha^\sharp_\rho$ (see Lemma 5.5), so $x \mapsto h(x)$ must be increasing on $Q_\rho$. So in fact $\alpha^\sharp_\rho = h(g_1^-) < \cdots < h(g_j^-)$. The 'curve' $\psi(g^*(\alpha))$, $\alpha \ge \alpha^\sharp_\rho$, is then a (discontinuous) trace of $\psi(x)$, where $x$ runs over $Q_\rho$, piecewise continuously on intervals starting at $g_k^-$, which, by Lemma 5.9, correspond to local maxima of $\psi$. Figures 1 and 3 give some illustrations of possible behaviour. Observe that $\psi(x)$ is not necessarily decreasing for all $x \in Q_\rho$.

Note that condition (a) in Proposition 5.8 is not necessary for the sharp transition property (5.15) to hold. Two other relevant conditions are:

(b) $\psi$ has a single root in $(0,1)$;

(c) the global minimum of $h$ on $(0,1)$ is the rightmost local minimum.

If $\mathbb{P}[W \ge 3] = 1$ and $\mathbb{E}[W] < \infty$, then $h(x) \to \infty$ as $x \to 0$ and as $x \to 1$, so (a) $\Rightarrow$ (c), while in the course of the proof of Proposition 5.8 below, we show that (a) $\Rightarrow$ (b) as well. We mention 3 illustrative examples.

• An example for which conditions (a) and (b) do not hold but (c) does is provided by $\rho(s) = 0.9s^3 + 0.1s^{38}$, for which $\psi$ has 3 positive roots (see Figure 4).

• An example for which condition (b) holds but conditions (a) and (c) do not is $\rho(s) = 0.9s^3 + 0.1s^{24}$, for which $g^*$ has two discontinuities (see Figure 1).

• An example in which none of (a), (b) or (c) holds and where (5.15) fails is provided by $\rho(s) = 0.9183s^3 + 0.04s^{19} + 0.0417s^{41}$ (see Figure 3).

Figure 4: Plots of $y = \psi(x)$ for $\rho(s) = s^3$ (left) and $\rho(s) = 0.9s^3 + 0.1s^{38}$ (right). In the first case the only positive root is $x_1 \approx 0.883414$, while in the second case the 3 positive roots are $x_1 \approx 0.901174$, $x_2 \approx 0.937414$, and $x_3 \approx 0.997979$. For the case on the right $\alpha^\sharp_\rho \approx 0.872923$ and $g^*(\alpha^\sharp_\rho) \approx 0.988192$, and only the root $x_3$ exceeds this value. Proposition 5.8 gives $\underline{\alpha}_\rho \approx 0.917935$ for the case on the left and $\underline{\alpha}_\rho \approx 0.998263$ for the case on the right.

Proof of Proposition 5.8. First we show that if $\mathbb{P}[W \ge 3] = 1$ and $\mathbb{E}[W] < \infty$, then $\psi$ has at least one root in $(0,1)$. So suppose that there exists an integer $r \ge 3$ for which $\mathbb{P}[W \ge r] = 1$ and $\mathbb{P}[W = r] = p > 0$. Then $\rho(s) \sim p s^r$ as $s \downarrow 0$. From (5.17) we have
\begin{align*}
\psi''(x) = {} & (1-x)^{-1}\left[\frac{2\rho(x)\rho''(x)}{\rho'(x)^2} - 1\right] - (1-x)^{-2}\,\frac{\rho(x)}{\rho'(x)} \\
& - \left[\frac{\rho''(x)}{\rho'(x)} + \frac{\rho(x)\rho'''(x)}{\rho'(x)^2} - \frac{2\rho(x)\rho''(x)^2}{\rho'(x)^3}\right]\log(1-x). \tag{5.18}
\end{align*}
Taking $x \downarrow 0$ in (5.17) and (5.18), using $\rho^{(k)}(x) \sim \frac{r!}{(r-k)!}\, p\, x^{r-k}$ for $k \le 3$, we obtain
\[ \psi'(0) = 0; \qquad \psi''(0) = \frac{r-2}{r} > 0, \]
since $r \ge 3$. Hence $\psi(0) = 0$ is a local minimum, and $\psi(x) > 0$ for $x > 0$ small enough. But $\psi(x) \to -\infty$ as $x \uparrow 1$, so continuity implies that $\psi$ has at least one root in $(0,1)$.

Consider the condition (a) in the proposition. Suppose that $h$ has a unique local minimum located at $x_\rho \in (0,1)$, so $\alpha^\sharp_\rho = h(x_\rho)$. Then by Lemma 5.9, $\psi$ has a unique local maximum at $x_\rho$, and necessarily $\psi(x_\rho) > 0$. By continuity (and Rolle's theorem) it follows that $\psi$ has exactly one root $x^*_\rho \in (x_\rho, 1)$. So (a) $\Rightarrow$ (b). Moreover, it follows that $\psi(x) > 0$ for $x \in (0, x^*_\rho)$ and $\psi(x) < 0$ for $x > x^*_\rho$. Hence in this case $Q_\rho = [x_\rho, 1)$, and the claim (5.15) follows.

Finally, we show that if $\rho(s) = s^r$ for some $r \ge 3$, then (a) holds. In the case $\rho(s) = s^r$ we obtain (cf. (5.18))
\[ \psi''(x) = \frac{r-1}{r(1-x)} - \frac{1}{r(1-x)^2}. \]
Hence $\psi''(0) = \frac{r-2}{r} > 0$ and for $x \in (0,1)$ we have $\psi''(x) = 0$ if and only if $x = \frac{r-2}{r-1}$ (an inflexion point). So, by Rolle's theorem, $\psi'(x) = 0$ for at most one $x \in (0,1)$, necessarily $x \in \left(\frac{r-2}{r-1}, 1\right)$, and this must be a local maximum for $\psi$ since $\psi(x) \to -\infty$ as $x \uparrow 1$. Another application of Rolle's theorem shows that $\psi$ has a single positive root, $x^*_r$ say, which must be in $\left(\frac{r-2}{r-1}, 1\right)$. By Lemma 5.9, the fact that $\psi$ has a single local maximum also shows that $h$ has a single local minimum in $(0,1)$, which lies in $\left(\frac{r-2}{r-1}, 1\right)$. □



Now we can complete the proof of Theorem 2.3.

Proof of Theorem 2.3. The expected number $\mathbb{E}_{\rho_n}[\mathcal{N}(n,m)]$ of null vectors is at least one, and may be large even when $\mathbb{P}_{\rho_n}[T_n \le \alpha n]$ is small. Nevertheless we can derive bounds on $T_n/n$ by studying the asymptotics of $\mathbb{E}_{\rho_n}[\mathcal{N}(n,m)]$ because
\[ \mathbb{P}_{\rho_n}[T_n \le m] = \mathbb{P}_{\rho_n}[\mathcal{N}(n,m) \ge 2] \le \mathbb{E}_{\rho_n}[\mathcal{N}(n,m)] - 1, \tag{5.19} \]
by Markov's inequality applied to the nonnegative random variable $\mathcal{N}(n,m) - 1$.

Suppose that $m_n/n \to \alpha \in (0, \alpha^*)$. Then by (5.19) with (2.4), $\mathbb{P}_{\rho_n}[T_n \le m_n] = O(n^{-1})$. It follows that, for any $\varepsilon > 0$, $\mathbb{P}_{\rho_n}[T_n \le (\alpha^* - \varepsilon)n] \to 0$. On the other hand, Theorem 5.6(iii) implies that there exists $\delta > 0$ such that for any $\alpha \in (\underline{\alpha}_\rho, \underline{\alpha}_\rho + \delta)$, $\mathbb{P}_{\rho_n}[T_n \le \alpha n] \to 1$. Moreover, these results together show that $\alpha^* \le \underline{\alpha}_\rho$, and $\underline{\alpha}_\rho \le 1$ by Proposition 5.7. □
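The quantity bounded in (5.19) can be computed exactly for small matrices, because the left null vectors over GF(2) form a vector space of dimension $m$ minus the rank. The following is a minimal sketch, assuming each row is stored as a Python integer bitmask over the $n$ columns; the helper names `gf2_rank`, `count_left_null_vectors` and `random_row` are hypothetical and not part of the original text.

```python
import random

def gf2_rank(rows):
    """Rank over GF(2) of a matrix whose rows are ints used as column bitmasks."""
    basis = {}  # leading bit position -> reduced row with that leading bit
    for v in rows:
        while v:
            lead = v.bit_length() - 1
            if lead not in basis:
                basis[lead] = v
                break
            v ^= basis[lead]  # eliminate the leading bit and keep reducing
    return len(basis)

def count_left_null_vectors(rows):
    """Number of a in {0,1}^m with aM = 0 (mod 2), including the zero vector."""
    return 2 ** (len(rows) - gf2_rank(rows))

def random_row(n, weight, rng):
    """A row with `weight` ones placed uniformly at random among n columns."""
    return sum(1 << c for c in rng.sample(range(n), weight))
```

A count of at least 2 means that a non-zero left null vector exists, which is the event whose probability appears on the left of (5.19).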



To conclude this section, we give the proof of Proposition 2.8.

Proof of Proposition 2.8. Take $\rho(s) = s^r$ for $r \ge 3$. As already mentioned, the asymptotic for $\alpha^*$ is in [5]. Fix $\alpha > 0$. Then,
\[ h(1 - e^{-\alpha r/2}) = \frac{-\log(e^{-\alpha r/2})}{r(1 - e^{-\alpha r/2})^{r-1}} = \frac{\alpha}{2}(1 + o(1)), \]
as $r \to \infty$. Hence $h(1 - e^{-\alpha r/2}) < \alpha$ for all $r$ sufficiently large, which by (2.8) shows that $\limsup_{r \to \infty} \alpha^\sharp_r \le \alpha$. Since $\alpha > 0$ was arbitrary, it follows that $\lim_{r \to \infty} \alpha^\sharp_r = 0$.

Finally, by Proposition 5.8, (2.18) holds. Then with (2.22) and repeated Taylor expansions we obtain
\begin{align*}
\underline{\alpha}_r &= \frac{-\log\left(e^{-r} + r^2 e^{-2r} + O(r^4 e^{-3r})\right)}{r\left(1 - e^{-r} - r^2 e^{-2r} + O(r^4 e^{-3r})\right)^{r-1}} \\
&= \frac{1 - r^{-1}\log\left(1 + r^2 e^{-r} + O(r^4 e^{-2r})\right)}{\left(1 - e^{-r} - r^2 e^{-2r} + O(r^4 e^{-3r})\right)^{r-1}} \\
&= \left(1 - r e^{-r} + O(r^3 e^{-2r})\right)\left(1 + (r-1)e^{-r} + O(r^3 e^{-2r})\right) \\
&= 1 - e^{-r} + O(r^3 e^{-2r}),
\end{align*}
completing the proof of (2.21). □
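The expansion for large $r$ can be compared with a direct numerical solution. The sketch below is an illustration under stated assumptions: it takes $\rho(s) = s^r$, assumes the form $\psi(x) = x + (1 - x + x/r)\log(1-x)$ for (2.6), and works in the variable $u = 1 - x$ so that the root near $x = 1$ is not lost to rounding. The function names are hypothetical.

```python
import math

def psi_in_u(u, r):
    """psi(1 - u) for rho(s) = s**r, written in u = 1 - x (assumed form of (2.6))."""
    return 1.0 - u + (u + (1.0 - u) / r) * math.log(u)

def h_in_u(u, r):
    """h(1 - u) = -log(u) / (r * (1 - u)**(r - 1))."""
    return -math.log(u) / (r * (1.0 - u) ** (r - 1))

def lower_threshold(r, iterations=200):
    """h evaluated at the root of psi, found by bisection on a logarithmic scale in u."""
    lo, hi = 1e-200, 1.0 / (r - 1)   # psi < 0 at lo, psi > 0 at hi
    for _ in range(iterations):
        mid = math.sqrt(lo * hi)     # geometric midpoint, since the root is tiny
        if psi_in_u(mid, r) < 0:
            lo = mid
        else:
            hi = mid
    return h_in_u(math.sqrt(lo * hi), r)
```

For $r = 10$ the computed value differs from $1 - e^{-r}$ by far less than $r^3 e^{-2r}$, in line with the order of the error term in the last display.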
6 Technical appendix



6.1 Parity of random variables 

The following simple result, on the probability that certain integer-valued random variables take even values, will be used several times. The formula in Lemma 6.1(i) for the probability that a binomial variable is even may be found for example in [16, pp. 277-278].

Lemma 6.1. Let $X$ be a $\mathbb{Z}_+$-valued random variable with probability generating function $\phi(s) := \mathbb{E}[s^X]$. Then $\mathbb{P}[X \in 2\mathbb{Z}] = \frac{1}{2}(1 + \phi(-1))$. In particular (i) if $X \sim \mathrm{Bin}(n,p)$, then $\mathbb{P}[X \in 2\mathbb{Z}] = (1 + (1-2p)^n)/2$; and (ii) if $X \sim \mathrm{Po}(\mu)$, then $\mathbb{P}[X \in 2\mathbb{Z}] = e^{-\mu}\cosh\mu$.

Proof. For $X \in \mathbb{Z}_+$, $\mathbf{1}\{X \in 2\mathbb{Z}\} = \frac{1}{2}(1 + (-1)^X)$, yielding the first statement in the lemma. For the $\mathrm{Bin}(n,p)$ case, $\phi(s) = (ps + 1 - p)^n$, giving part (i), while in the $\mathrm{Po}(\mu)$ case, $\phi(s) = e^{\mu(s-1)}$, which gives $\mathbb{P}[X \in 2\mathbb{Z}] = (1 + e^{-2\mu})/2 = e^{-\mu}\cosh\mu$. □
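Both closed forms in Lemma 6.1 can be compared with direct summation of the probability mass functions. The following is a minimal sketch; the truncation at 200 terms for the Poisson sum is an arbitrary choice made here, adequate for moderate values of $\mu$.

```python
import math

def prob_even_binomial(n, p):
    """P[X even] for X ~ Bin(n, p), by summing the mass function over even values."""
    return sum(math.comb(n, j) * p ** j * (1 - p) ** (n - j)
               for j in range(0, n + 1, 2))

def prob_even_poisson(mu, terms=200):
    """P[X even] for X ~ Po(mu), by summing the first `terms` probabilities."""
    total = 0.0
    term = math.exp(-mu)          # P[X = 0]
    for j in range(terms):
        if j % 2 == 0:
            total += term
        term *= mu / (j + 1)      # P[X = j + 1] from P[X = j]
    return total

def closed_form_binomial(n, p):
    return (1 + (1 - 2 * p) ** n) / 2

def closed_form_poisson(mu):
    return math.exp(-mu) * math.cosh(mu)
```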

6.2 Parity of multinomial random variables 



We saw in Lemma 6.1 that the probability that a $\mathrm{Bin}(n,p)$ random variable is even is $(1 + (1-2p)^n)/2$. In this section we extend this formula to a more complicated multinomial setting and more general congruence conditions modulo $r$.

Here is our probabilistic model. We perform a sequence of $n$ independent trials. Each trial is probabilistically identical, and we are interested in the outcome of a trial described in terms of an arbitrary collection of $k$ events $A_1, \ldots, A_k$. Probabilities $p(E_I)$ are specified for each of the 'elementary' events $E_I$ defined as
\[ E_I := \left(\cap_{i \in I} A_i\right) \cap \left(\cap_{j \notin I} A_j^c\right), \]
for each $I \subseteq [k]$ (here $[k] := \{1, \ldots, k\}$). We assume that $p(E_I) \ge 0$ for each $I$ and that $\sum_{I \subseteq [k]} p(E_I) = 1$. Use $\mathbb{P}_n$ to denote the probability measure associated with the model consisting of $n$ trials as just described.

Let $N_i$ be the number of occurrences of event $A_i$. For $L \subseteq J \subseteq [k]$ set
\[ E_{J,L} := \left(\cap_{i \in L} A_i\right) \cap \left(\cap_{i \in J \setminus L} A_i^c\right), \]
so that $E_I = E_{[k],I}$. Also, set $p_o(J)$ to be the probability for a single trial that an odd number of outcomes $A_i$, $i \in J$, occur, and note that
\[ 1 - 2p_o(J) = \sum_{L \subseteq J} (-1)^{|L|}\, \mathbb{P}(E_{J,L}), \]
where $|L|$ denotes the number of elements of $L$. The next result gives a general formula for the probability in $n$ trials that for each $i$ the $N_i$ falls into a particular congruence class modulo $r$, and a specialization to the event that $N_i$ is even for each $i$ in a specified subset $I$ of $[k]$ and $N_i$ is odd for each $i \in [k] \setminus I$. For positive integer $r$, define the complex number $\omega := e^{2\pi \mathrm{i}/r}$ (a complex $r$th root of unity).

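Lemma 6.2. (i) Let $r \ge 2$ be an integer and $t = (t_1, \ldots, t_k) \in \{0, 1, \ldots, r-1\}^k$. Then
\[ \mathbb{P}_n\left[\cap_{i=1}^k \{N_i \equiv t_i \ (\mathrm{mod}\ r)\}\right] = r^{-k} \sum_{h \in \{0,1,\ldots,r-1\}^k} \omega^{-t \cdot h} \left(\sum_{g \in \{0,1\}^k} \omega^{g \cdot h}\, p_g\right)^n, \tag{6.1} \]
where for $g = (g_1, \ldots, g_k) \in \{0,1\}^k$ we set $p_g = p(E_g)$ with the event
\[ E_g := \left(\cap_{i \in \{1,\ldots,k\} : g_i = 1} A_i\right) \cap \left(\cap_{i \in \{1,\ldots,k\} : g_i = 0} A_i^c\right) \]
defined in terms of a single trial, and $t \cdot h := \sum_{i=1}^k t_i h_i$ and $g \cdot h := \sum_{i=1}^k g_i h_i$.

(ii) For any $I \subseteq [k]$,
\[ \mathbb{P}_n\left[\left(\cap_{i \in I} \{N_i \in 2\mathbb{Z}\}\right) \cap \left(\cap_{i \in [k] \setminus I} \{N_i \notin 2\mathbb{Z}\}\right)\right] = 2^{-k} \sum_{J \subseteq [k]} (-1)^{|J \setminus I|} \left(1 - 2p_o(J)\right)^n. \tag{6.2} \]

Proof. Part (ii) follows from the $r = 2$ case of part (i) on setting $t_i$ to be 0 on $I$ and 1 otherwise, and identifying $h \in \{0,1\}^k$ with the set $J = \{i : h_i = 1\}$. Thus we need to prove part (i). We obtain (6.1) by induction on $n$. First consider the case $n = 0$. In this case the left side of (6.1) is equal to 1 if $t_i = 0$ for all $i$ and is equal to zero otherwise, and the right side is equal to
\[ r^{-k} \sum_{h \in \{0,1,\ldots,r-1\}^k} \omega^{-t \cdot h}. \]
If $t = 0$ then this expression is 1. Otherwise, it is zero since if $t_i \ne 0$ for some $i$ then
\[ \sum_{h_i = 0}^{r-1} \omega^{-t_i h_i} = \frac{1 - \omega^{-r t_i}}{1 - \omega^{-t_i}} = 0. \]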

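Thus the inductive hypothesis holds for $n = 0$. Suppose it holds for some $n$. In the case of $n + 1$ trials, conditioning on the outcome of the first trial we obtain
\begin{align*}
\mathbb{P}_{n+1}\left[\cap_{i=1}^k \{N_i \equiv t_i \ (\mathrm{mod}\ r)\}\right]
&= \sum_{g \in \{0,1\}^k} p_g\, \mathbb{P}_{n+1}\left[\cap_{i=1}^k \{N_i \equiv t_i \ (\mathrm{mod}\ r)\} \,\middle|\, E_g\right] \\
&= \sum_{g \in \{0,1\}^k} p_g\, \mathbb{P}_n\left[\cap_{i=1}^k \{N_i \equiv t_i - g_i \ (\mathrm{mod}\ r)\}\right] \\
&= \sum_{g \in \{0,1\}^k} p_g\, r^{-k} \sum_{h \in \{0,1,\ldots,r-1\}^k} \omega^{-(t-g) \cdot h} \left(\sum_{g' \in \{0,1\}^k} \omega^{g' \cdot h}\, p_{g'}\right)^n \\
&= r^{-k} \sum_{h \in \{0,1,\ldots,r-1\}^k} \omega^{-t \cdot h} \left(\sum_{g \in \{0,1\}^k} \omega^{g \cdot h}\, p_g\right) \left(\sum_{g' \in \{0,1\}^k} \omega^{g' \cdot h}\, p_{g'}\right)^n \\
&= r^{-k} \sum_{h \in \{0,1,\ldots,r-1\}^k} \omega^{-t \cdot h} \left(\sum_{g \in \{0,1\}^k} \omega^{g \cdot h}\, p_g\right)^{n+1},
\end{align*}
which completes the induction. □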
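The $r = 2$ specialization can be checked by brute force for small $k$ and $n$. The following is a minimal sketch, assuming the per-trial law is given as a dictionary from indicator vectors $g \in \{0,1\}^k$ to probabilities $p_g$, with events indexed from 0 rather than 1, and with the sign convention $(-1)^{|J \setminus I|}$ that results from taking $t_i = 0$ on $I$ and $t_i = 1$ elsewhere in (6.1). The function names are hypothetical.

```python
from itertools import product

def parity_probability_exact(p, k, n, even_set):
    """P[N_i even for i in even_set, odd otherwise], by dynamic programming on parities."""
    state = {tuple([0] * k): 1.0}
    for _ in range(n):
        nxt = {}
        for parity, weight in state.items():
            for g, pg in p.items():
                new = tuple((a + b) % 2 for a, b in zip(parity, g))
                nxt[new] = nxt.get(new, 0.0) + weight * pg
        state = nxt
    target = tuple(0 if i in even_set else 1 for i in range(k))
    return state.get(target, 0.0)

def parity_probability_formula(p, k, n, even_set):
    """Right-hand side of the r = 2 formula, summing over subsets J of the k events."""
    total = 0.0
    for h in product((0, 1), repeat=k):
        J = [i for i in range(k) if h[i]]
        # 1 - 2 p_o(J) = E[(-1)^(number of events A_i, i in J, that occur)]
        char = sum(pg * (-1) ** sum(g[i] for i in J) for g, pg in p.items())
        sign = (-1) ** sum(1 for i in J if i not in even_set)
        total += sign * char ** n
    return total / 2 ** k

# A hypothetical single-trial law for k = 2 events.
example_law = {(0, 0): 0.4, (1, 0): 0.3, (0, 1): 0.2, (1, 1): 0.1}
```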



6.3 Generating function properties 

The next result collects some elementary properties of probability generating functions.

Lemma 6.3. Let $\phi(s) := \mathbb{E}[s^X]$, $s \in [-1, 1]$, for a $\mathbb{Z}_+$-valued random variable $X$. Then $\phi(0) = \mathbb{P}[X = 0]$, $\phi(1) = 1$, and $\phi(s)$ is infinitely differentiable at least for $s \in (-1, 1)$; if $\mathbb{E}[X] < \infty$ then $\phi'(s) = \frac{d}{ds}\phi(s)$ is continuous in the closed interval $[-1, 1]$. Moreover,

(i) Suppose that $\mathbb{P}[X = 0] = 0$. Then as $s \downarrow 0$,
\[ \phi(s) = s\,\mathbb{P}[X = 1] + O(s^2), \quad \text{and} \quad \phi'(s) = \mathbb{P}[X = 1] + O(s). \]

(ii) If $\mathbb{E}[X] < \infty$, then as $s \downarrow 0$,
\[ \phi(1 - s) = 1 - s\,\mathbb{E}[X] + o(s), \quad \text{and} \quad \phi'(1 - s) = \mathbb{E}[X] + o(1). \]

(iii) For any $s \in [0, 1]$, $|\phi(-s)| \le \phi(s)$.

Proof. Apart perhaps from part (iii), all of the properties stated in the lemma are well known: see for example [16, pp. 264-266]. For part (iii), let $s \in [0, 1]$. Then
\[ |\phi(-s)| \le \mathbb{E}\left[|(-s)^X|\right] = \mathbb{E}[s^X] = \phi(s). \]
□



6.4 Asymptotic estimates 

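We shall use the following bounds on the binomial coefficient $\binom{n}{k}$.

Lemma 6.4. Let $n \in \mathbb{N}$ and $k \in \{0, 1, \ldots, n\}$. Then
\[ \binom{n}{k} \le \left[\left(\frac{k}{n}\right)^{k/n} \left(1 - \frac{k}{n}\right)^{1 - (k/n)}\right]^{-n} \le n^k e^k k^{-k}. \tag{6.3} \]
On the other hand, if $0 < k < n$,
\[ \binom{n}{k} \ge \left(\frac{n}{2\pi k(n-k)}\right)^{1/2} e^{-1/6} \left[\left(\frac{k}{n}\right)^{k/n} \left(\frac{n-k}{n}\right)^{(n-k)/n}\right]^{-n}. \tag{6.4} \]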



Proof. We apply Robbins's refinement of Stirling's formula (see e.g. [16, §II.9]), which says that for any $n \ge 1$,
\[ n! = (2\pi)^{1/2}\, n^{n + (1/2)}\, e^{-n + \varepsilon_n}, \]
where $\frac{1}{12n+1} < \varepsilon_n < \frac{1}{12n}$. This yields the upper bound, for $n \ge 1$ and $k, n-k \ge 1$,
\[ \binom{n}{k} \le \left(\frac{n}{2\pi k(n-k)}\right)^{1/2} \left[\left(\frac{k}{n}\right)^{k/n} \left(\frac{n-k}{n}\right)^{(n-k)/n}\right]^{-n}, \tag{6.5} \]
where we have used the fact that
\[ \varepsilon_n - \varepsilon_k - \varepsilon_{n-k} \le \frac{1}{12n} - \frac{12n+2}{144k(n-k) + 12n + 1} \le \frac{1}{12n} - \frac{12n+2}{36n^2 + 12n + 1} < 0, \]
since $k(n-k) \le n^2/4$. By considering separately the cases (i) $k \in \{0, n\}$, and (ii) $0 < k < n$, using (6.5) in case (ii), we obtain the first inequality in (6.3). The second inequality in (6.3) follows from the fact that
\[ \left(1 - \frac{k}{n}\right)^{-(n-k)} = \left(1 + \frac{k}{n-k}\right)^{n-k} \le e^k. \]
For the lower bound, another application of Robbins's bounds yields (6.4), where for the $e^{-1/6}$ term we have used the fact that $\varepsilon_n - \varepsilon_k - \varepsilon_{n-k} > -\frac{1}{12k} - \frac{1}{12(n-k)} \ge -\frac{1}{6}$. □
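The bounds (6.3) and (6.4) can be checked numerically over a range of small $n$. The following is a minimal sketch using floating-point arithmetic, with the convention $0^0 = 1$ at the endpoints $k \in \{0, n\}$; the function names are hypothetical.

```python
import math

def entropy_bound(n, k):
    """((k/n)^(k/n) * (1 - k/n)^(1 - k/n))^(-n), with the convention 0^0 = 1."""
    if k in (0, n):
        return 1.0
    return (n / k) ** k * (n / (n - k)) ** (n - k)

def crude_upper_bound(n, k):
    """n^k e^k k^(-k), the rightmost expression in (6.3)."""
    return 1.0 if k == 0 else (n * math.e / k) ** k

def lower_bound(n, k):
    """Right-hand side of (6.4), valid for 0 < k < n."""
    prefactor = math.sqrt(n / (2 * math.pi * k * (n - k))) * math.exp(-1 / 6)
    return prefactor * entropy_bound(n, k)

def bounds_hold(n, k, slack=1e-12):
    """Check (6.3), and (6.4) when 0 < k < n, allowing a small relative rounding slack."""
    c = math.comb(n, k)
    ok = c <= entropy_bound(n, k) * (1 + slack)
    ok = ok and entropy_bound(n, k) <= crude_upper_bound(n, k) * (1 + slack)
    if 0 < k < n:
        ok = ok and c >= lower_bound(n, k) * (1 - slack)
    return ok
```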